(19) Europäisches Patentamt / European Patent Office / Office européen des brevets

(11) EP 0 784 285 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 16.07.1997 Bulletin 1997/29

(21) Application number: 97100342.1

(22) Date of filing: 10.01.1997

(51) Int. Cl.6: G06K 9/62



(84) Designated Contracting States: 
DE FR GB IT NL 



(30) Priority: 



12.01.1996 JP 3836/96 
26.02.1996 JP 37816/96 
12.04.1996 JP 91091/96 
12.04.1996 JP 91097/96 



(71) Applicant: CANON KABUSHIKI KAISHA
Tokyo (JP) 



(72) Inventors: 

• Hiroto, Yoshii
Ohta-ku, Tokyo 146 (JP)

• Tsunekazu, Arai
Ohta-ku, Tokyo 146 (JP)

• Eiji, Takasu
Ohta-ku, Tokyo 146 (JP)

(74) Representative: Grams, Klaus Dieter, Dipl.-Ing.
Patentanwaltsbüro
Tiedtke-Bühling-Kinne & Partner
Bavariaring 4
80336 München (DE)



[Figure: flowchart excerpt with the boxes "TRAINING STROKE", "DIVIDE STROKE INTO STROKE SEGMENTS" and "MAKE STROKE SEGMENTS VECTORS"]



(54) Method and apparatus for generating a classification tree 

(57) The present invention relates to a classification tree generation method whereby, in order to efficiently and
accurately recognize a pattern having a large number of characteristics, a pattern classification tree is generated
in which a macro structural characteristic of the pattern is appropriately reflected and the competitive relationship
between categories is adequately reflected, and to a method for recognizing an input pattern by using the generated
classification tree.

When an input pattern is formed using strokes, a 
training stroke is divided into a plurality of segments, 
and vector quantization is performed for the strokes in 
the segments. Among the quantized strokes in the seg- 
ments, adjacent stroke sets are synthesized to repeti- 
tively generate upper rank stroke vectors. A stroke 
vector for which a predetermined entropy function is 
maximized is selected from the upper rank stroke vec- 
tors in a layered stroke vector series, and development 
is performed extending down into the lower rank stroke 
vector sets. As a result, a classification tree is prepared.

Description

BACKGROUND OF THE INVENTION 
Field of the Invention

The present invention relates to recognition of patterns, such as character and speech patterns, and more partic- 
ularly, to a technique for preparing data for pattern recognition of characters, sounds, etc. 

Related Background Art

Conventionally, for recognition of handwritten characters, which constitute a type of pattern, one step-by-step pro- 
cedure utilizes a classification tree to sort patterns into categories. 

Since the conventional recognition method using a classification tree focuses only on the number of characteristics
at the individual nodes when preparing the nodes, the broader aspects of the pattern cannot be determined.

In order to make a classification tree for recognition of a pattern having a large number of characteristics, a method
for selecting a characteristic axis at the individual nodes must be employed because of the time required for calculation.
In addition, there is a conventional method, which utilizes an N-gram table and which is employed for sentence
recognition, whereby a finite automaton is used as a language model for the constitution of sentences, and whereby, based
on this model, the prior probability of the occurrence of a character row is calculated.

In other words, this method requires a step of calculating, from a large-scale sentence database, the probability
concerning the continuation of the element rows that constitute sentences.

However, for a language, such as Japanese or Chinese, that includes several thousands of character types, a large
amount of sentence data is required even to prepare a trigram table (N = 3).

If a table is to be prepared using a small amount of sentence data, a reliable shifting probability and an unreliable 
shifting probability coexist in the table, and a defect occurs. 

There is also a conventional method for preparing a classification tree through pre-processing that involves the
step-by-step degeneration of a pattern. According to this method, a well balanced classification tree can be constructed
from the macro to the micro form of a pattern. As a result, a recognition function that is as close as possible to the
recognition ability of human beings can be expected.

However, since this method absorbs modifications of a pattern by using a variety of training patterns, an enormous 
amount of training patterns is required. 

This condition will be explained while referring to Fig. 32. 

Suppose that a classification tree is prepared according to the conventional method for the recognition of numerical
bit maps ranging from "0" through "9".

A classification tree constructed by the above method is shaped as shown in Fig. 32. Training patterns for three
categories, "4", "5" and "6", are present at the fifth branch from the right in Fig. 32.

In other words, broadly speaking, no categories other than the three categories "4", "5" and "6" are available for the
training patterns at the fifth branch from the right in Fig. 32.
As an example, consider the process of recognition using the provided classification tree. Broadly speaking, all the
bit maps shown in Figs. 41A through 41E have the same shape as the fifth branch from the right in Fig. 32. In other
words, when the above explained classification tree is used for recognition of these bit maps, the bit maps are always
classified as belonging to the categories "4", "5" and "6". As a result, the bit maps in Figs. 41A through 41C are
correctly identified, but the bit map in Fig. 41D, which is identified, should be rejected, and the one in Fig. 41E is
apparently incorrectly identified.

The reason such a defect occurs is that there is no pattern having the category "2" that is shaped like the one in
Fig. 41E. This means that for the conventional method, an enormous quantity of training patterns, which include all
possible permutations, is required.

SUMMARY OF THE INVENTION

It is therefore one object of the present invention to provide a classification tree generation method for generating
a classification tree composed of stroke vectors, with which the macro structural characteristic of a pattern that has a
large number of characteristics is appropriately reflected and with which the competitive relationship that exists among
categories is appropriately reflected, and an apparatus therefor; and to provide a character recognition method
whereby a generated classification tree is used to recognize characters at high speed and at a high recognition ratio,
and an apparatus therefor.

According to the present invention, a layered character pattern can be efficiently generated from a character pattern.

In addition, based on a generated layered character pattern, a classification tree can be prepared wherein the
competition of categories is most intense at the upper layer, and wherein the categories are preferably sorted at the layer
immediately below.

Further, a memory-efficient N-gram table can be generated by using the produced classification tree.
Moreover, recognition at high speed and at a high recognition ratio can be performed by searching the thus
acquired N-gram table.

According to the present invention, sub-patterns are extracted from training patterns, and layering is performed in
advance for the sub-patterns. Based on the layered sub-patterns, a classification tree for the sub-patterns is prepared,
so that a high recognition ratio can be provided even with a small quantity of training patterns.
According to the present invention, in the layering process, data are produced from the sequential degeneration of
detailed sub-pattern data, so that a fast recognition process can be provided.

According to the present invention, the classification tree is prepared by developing the layered sub-pattern data 
from the upper rank through the lower rank, so that dictionary data having a high recognition efficiency can be provided. 

According to the present invention, when sub-patterns are regarded as pattern segments obtained by dividing a
training pattern, the preparation of sub-patterns is easy.

According to the present invention, a variable for which efficiency of classification is the greatest is selected, and a 
classification tree is prepared for the selected variable. As a result, an efficient classification tree that differs from the 
conventional one can be provided. 

According to the present invention, layering is performed on an input pattern. The layered input pattern is
recognized by tracing the classification tree, beginning at the upper rank data for the pattern and continuing to the lower rank.
As a result, a high recognition rate at a high speed can be provided.

According to the present invention, when the pattern is composed of bit-mapped data, highly effective identification 
of image data input by a scanner, etc., can be performed. 

According to the present invention, when a pattern is stroke data, highly effective identification of tracing data input
by a pen can be performed.

According to the present invention, when a pattern is speech data, highly effective identification of speech data 
input at a microphone, etc., can be performed. 

BRIEF DESCRIPTION OF THE DRAWINGS 


Fig. 1 is a block diagram illustrating the arrangement of an apparatus according to a first embodiment of the present 
invention; 

Fig. 2 is a flowchart showing a method for generating an on-line handwritten character recognition dictionary for the 
first embodiment; 

Fig. 3 is a flowchart of the processing for generating an on-line handwritten character recognition dictionary for the
first embodiment;

Fig. 4 is a diagram for explaining the processing for a stroke generation phase in the first embodiment;

Fig. 5 is a diagram showing a layered vector series;

Fig. 6 is a diagram for explaining a vector averaging process in the first embodiment;

Fig. 7 is a diagram showing a classification tree for the first embodiment;

Fig. 8 is a diagram showing an example data configuration of the classification tree for the on-line handwritten
character recognition dictionary in the first embodiment;

Fig. 9 is a flowchart showing an on-line handwritten character recognition method for the first embodiment;

Fig. 10 is a detailed flowchart for a classification tree generation process for the first embodiment;

Fig. 11 is a diagram illustrating an example in the first embodiment of the generation of branches at step S1007 in
Fig. 10;

Fig. 12 is a diagram illustrating a first arrangement of the apparatus according to the first embodiment;

Fig. 13 is a diagram illustrating a second arrangement of the apparatus according to the first embodiment;

Fig. 14 is a block diagram illustrating the arrangement of an apparatus according to a second embodiment;

Fig. 15 is a conceptual diagram showing information processing according to the second embodiment;

Fig. 16 is a diagram illustrating a neural network having a pyramid shape that is a part of the processing in the
second embodiment;

Fig. 17 is a flowchart showing information processing according to the second embodiment;

Fig. 18 is a diagram illustrating an example training pattern according to the second embodiment;

Fig. 19 is a diagram illustrating an example of layered training patterns according to the second embodiment;

Fig. 20 is a diagram showing a classification tree generation process according to the second embodiment;

Fig. 21 is a diagram illustrating an example classification tree that is generated according to the second
embodiment;
Fig. 22 is a diagram showing an example of the grouping of large categories according to the second embodiment;

Fig. 23 is a flowchart illustrating second processing according to the second embodiment;

Fig. 24 is a second flowchart for the information processing according to the second embodiment;

Fig. 25 is a diagram illustrating an example memory layout, with program modules, according to the second
embodiment;

Fig. 26 is a diagram illustrating the hardware arrangement of an apparatus according to a third embodiment;

Fig. 27 is a diagram showing a classification tree preparation process according to the third embodiment;

Fig. 28 is a diagram illustrating the arrangement of the apparatus according to the third embodiment;

Fig. 29 is a flowchart showing the processing for the third embodiment;

Fig. 30 is a diagram for explaining the extraction of sub-patterns according to the third embodiment;

Fig. 31 is a diagram illustrating the configuration of a pyramid according to the third embodiment;

Fig. 32 is a diagram showing a classification tree that is being prepared according to the third embodiment;

Fig. 33 is a flowchart showing the classification preparation processing according to the third embodiment;

Fig. 34 is a diagram showing layered patterns at the lower rank that are generated by selected neurons according
to the third embodiment;

Fig. 35 is a diagram illustrating a classification tree that is finally prepared according to the third embodiment;

Fig. 36 is a diagram showing recognition processing according to the third embodiment;

Fig. 37 is a diagram illustrating a classification tree preparation process according to the third embodiment;

Fig. 38 is a flowchart showing second processing according to the third embodiment;

Fig. 39 is a diagram illustrating sub-vector extraction means according to the third embodiment;

Fig. 40 is a diagram illustrating second recognition processing according to the third embodiment;

Figs. 41A, 41B, 41C, 41D and 41E are diagrams illustrating prior art;

Fig. 42 is a diagram showing envelopes obtained by performing a Fourier transformation on a speech pattern
according to the third embodiment;

Fig. 43 is a graph showing speech patterns with intensity and a frequency represented along axes; and

Fig. 44 is a graph showing speech patterns with frequency and time represented along axes.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

(First Embodiment) 

In a first embodiment, a description will be given of a method for generating a classification tree for recognizing an 
entered stroke online and the recognition processing by using the generated classification tree. 

First, the major features of a method and an apparatus for recognizing characters according to this embodiment will
be described; a detailed description thereof will then follow.

In the character recognizing method according to the embodiment, a training stroke is divided into stroke segments 
which are formed into vectors, and resulting vector series are layered, then a classification tree is generated according 
to the obtained layered vector series. 

In a process for layering the vector series, the vector series information constituting the training stroke is degener- 
ated in steps. 

When generating the classification tree, a vector is selected which ensures the severest competition among cate- 
gories in an upper layer and good separation among categories in a layer immediately thereunder in accordance with 
an entropy standard which will be discussed later, and the vector which has been degenerated according to the result 
thereof is developed toward lower layers. 

A dictionary for online handwritten character recognition holds the foregoing classification tree as contents thereof. 

Further, the category of a stroke hand-drawn by a user is determined according to the foregoing classification tree.

The present invention will now be described in conjunction with the accompanying drawings. 

(Structure and generating method of classification tree ) 



Fig. 1 is a diagram showing an example of a schematic configuration of an information processing apparatus to 
which the method of online handwritten character recognition in accordance with the embodiment will be applied. 

An online handwritten character recognizing apparatus according to the embodiment is constituted primarily by a 
stroke input device 401 , a display 402, a central processing unit (CPU) 403, and a memory 404. 

The stroke input device 401 has, for example, a digitizer and a pen; it hands the coordinate data on a character or 
graphic, which has been entered on the digitizer by using the pen, over to the CPU 403. 

The display 402 displays stroke data entered through the stroke input device 401 and a result of recognition by the 
CPU 403. 

The CPU 403 recognizes a character or graphic composed of entered stroke data and also controls the entire
apparatus.

The memory 404 records a recognizing program and a dictionary used by the CPU 403, and it also temporarily
records entered stroke data, variables used by the recognizing program, etc.

Fig. 2 is a processing flowchart which best illustrates the procedure for generating the dictionary for
online handwritten character recognition according to an embodiment of the present invention.
Referring to Fig. 2, reference character S101 indicates a step for entering a training stroke, and S102 denotes a
step for dividing the entered training stroke into stroke segments.

Reference character S103 denotes a step for making the stroke segments vectors, the stroke segments resulting 
from the division performed in the preceding stroke dividing step. 

Reference character S104 denotes a step of the pre-layering process on the vector series that results from the
preceding step for making the stroke segments vectors.

Reference character 105 denotes a layered vector series generated in the step of the pre-layering process on the 
vector series. 

Reference character S106 denotes a classification tree generating step for making a classification tree in
accordance with the layered vector series.
Reference character S107 denotes a step for discriminating a development vector which is used in the process of
generating the classification tree in the classification tree making step.

Reference character 108 denotes a classification tree that has been completed. 
In this embodiment, the input is a training stroke in S101 and the output is the classification tree 108. 
Referring now to Fig. 3 to Fig. 7, a description will be given of the procedure for generating a classification tree in
a character recognizing process of the first embodiment according to the present invention.
For easier understanding, three different characters

"く", "し", and "つ"

which read "ku", "shi", and "tsu", respectively, each of which is drawn in one stroke, will be taken as examples
representing the categories to be recognized.

It is assumed that there are one hundred training patterns each for

"く", "し", and "つ",

respectively, for generating the dictionary; these are denoted as follows:

TPij (Training Pattern i, j)

where i is a suffix denoting the category and it takes a value in the following range:

0 ≤ i ≤ 2

j is a suffix denoting a training pattern number and it takes a value in the following range:

1 ≤ j ≤ 100

As illustrated by the flowchart shown in Fig. 3, the process of generating the dictionary for the online handwritten 
character recognition is composed of three steps, namely, a vector generation step, a pre-layering process step, and a 
classification tree generation step. The following will describe each of the steps. 

45 (F1 ) Vector generation step 

Referring to Fig. 4, the vector generation step will be described in detail. 

Firstly, the training stroke is divided into n segments (n = 8 in Fig. 4). Although the n segments shown in Fig. 4 are
obtained by dividing the stroke into n parts of equal length along the stroke, the present invention is not limited thereto.
For instance, if a stroke input device suffers from unstable strokes in the vicinity of start and end points thereof, then
it would be hardly meaningful to make efforts for obtaining detailed segment vectors from the stroke portion in the
vicinity of start and end points thereof. In such a case, longer distances may be allowed for the beginning segment and the
end segment out of the n stroke segments than those of the remaining segments.

In the next step, the respective segments of the n stroke segments are formed into vectors.
In Fig. 4, the stroke segments are quantized into the base vectors in twelve directions from number 0 to number 11.
The base vectors are arranged equidistantly in 360 degrees; however, the present invention is not limited thereto.

For example, of the base vectors in the twelve directions shown in Fig. 4, an upper left base vector (e.g. the vector
numbered 10 or 11) does not appear in a handwritten stroke. Therefore, a set of base vectors with such base vectors
arranged at a greater angle interval may be used.

In the process for producing the vectors, the step for dividing the stroke into stroke segments and the step for
making each segment a vector shown in Fig. 2 are implemented on all the training strokes.

In the case of the example shown in Fig. 4, the entered stroke is converted to base vector series "12455421". 
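
The vector generation step can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that the stroke is given as a list of (x, y) points with the y axis pointing upward, that the segments have equal arc length, and that the base vectors are numbered clockwise starting from "up" as number 0 (so that, with twelve directions, number 3 points right and numbers 10 and 11 point to the upper left). The function names are hypothetical.

```python
import math

def split_equal_length(points, n):
    """Return n + 1 points that cut the polyline into n pieces of equal arc length."""
    edge_len = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(edge_len)
    cuts = [points[0]]
    idx, walked = 0, 0.0  # walked = arc length at the start of edge idx
    for k in range(1, n):
        target = total * k / n
        while idx < len(edge_len) - 1 and walked + edge_len[idx] < target:
            walked += edge_len[idx]
            idx += 1
        t = (target - walked) / edge_len[idx] if edge_len[idx] else 0.0
        (x0, y0), (x1, y1) = points[idx], points[idx + 1]
        cuts.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    cuts.append(points[-1])
    return cuts

def quantize(dx, dy, directions=12):
    """Map a displacement to the nearest base vector number.

    Assumed numbering: 0 = up, increasing clockwise, y axis pointing upward.
    """
    angle = math.atan2(dx, dy) % (2 * math.pi)  # clockwise angle measured from "up"
    step = 2 * math.pi / directions
    return int((angle + step / 2) // step) % directions

def stroke_to_vectors(points, n=8, directions=12):
    """Divide a stroke into n segments and quantize each segment to a base vector."""
    cuts = split_equal_length(points, n)
    return [quantize(x1 - x0, y1 - y0, directions)
            for (x0, y0), (x1, y1) in zip(cuts, cuts[1:])]

# A hypothetical stroke that runs to the right and then straight down.
series = stroke_to_vectors([(0, 0), (4, 0), (4, -4)])
```

For this hypothetical stroke the first four segments are quantized to the rightward base vector and the last four to the downward one.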

(F2) Pre-layering step 

The training strokes which have been formed into the vectors are pre-layered pyramidally. 
Fig. 5 shows an example. 

In Fig. 5, an average vector from two adjacent vectors of the vector series obtained in the step for making each
segment a vector is stored in an upper layer, so that the vector information is sequentially reduced to a half (or degenerated).

The eight base vectors of the stroke as shown in Fig. 4 will be eventually converted to four vectors, two vectors, and 
one vector in sequence. 

One method for averaging two adjoining vectors will be described in detail, referring to Fig. 6. 

For the convenience of description, the base vectors in twelve directions shown in Fig. 5 will be in eight directions 
in Fig. 6. It should be noted that the spirit of the present invention remains unaffected even if the total number of the 
base vectors or the directions of the individual base vectors are changed. 

The following description will be given on an assumption that the base vectors have eight directions. The first vector 
of the adjoining two vectors will be denoted as "pre", and the following vector as "post". 

In a simple way, the average of the two vectors may be given by: 

(pre + post)/2 

There are cases, however, that the average obtained from the above formula does not provide a base vector. 
In general, the vectors equally divided into eight directions and the average vectors thereof provide the vectors in 
sixteen directions, and they must be processed to provide vectors in eight directions. 
Fig. 6 illustrates a method therefor. 

In Fig. 6, "->" (800) means the presence of a rightward vector (No. 2) in an upper layer. The eight pairs of vectors 
given thereunder indicate the pairs of vectors that should exist in a lower layer. 

Specifically, there are the following eight pairs which may be the pair of vectors (pre, post) indicated by No. 2 in an
upper layer:

(2, 2), (1, 3)
(3, 1), (0, 4)
(2, 3), (3, 2)
(1, 4), (4, 1)

This applies under a condition where the average value of pre and post obtained by (pre + post)/2 is greater than
1.5 and not greater than 2.5.

If the vectors in an upper layer have a number other than 2, then a set of vectors which is obtained by shifting the 
set of vectors shown in Fig. 6 by 45 degrees will be used. 

The set of vectors, namely the vector in the upper layer and the two vectors in the lower layer, is not limited to the 
one shown in Fig. 6; it may be any set of vectors as long as the vector in the upper layer can be regarded as an average 
vector of the two vectors in the lower layer. 
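
The pre-layering step can be sketched in Python as a table lookup. This is only an illustration under stated assumptions: the eight lower-layer pairs for upper vector No. 2 are taken to be exactly those whose arithmetic mean is greater than 1.5 and not greater than 2.5, the pairs for any other upper vector are obtained by shifting those pairs by the same number of 45-degree steps, and the names `PAIRS_FOR_2`, `build_average_table` and `build_pyramid` are hypothetical.

```python
# Lower-layer (pre, post) pairs assumed to average to upper vector No. 2
# with eight base directions.
PAIRS_FOR_2 = [(2, 2), (1, 3), (3, 1), (0, 4), (2, 3), (3, 2), (1, 4), (4, 1)]

def build_average_table(directions=8):
    """Map every (pre, post) pair to its upper-layer vector by rotating the No. 2 set."""
    table = {}
    for upper in range(directions):
        shift = upper - 2  # number of 45-degree steps away from vector No. 2
        for pre, post in PAIRS_FOR_2:
            table[((pre + shift) % directions, (post + shift) % directions)] = upper
    return table

TABLE = build_average_table()

def build_pyramid(series):
    """Return the layers from the bottom (full series) to the top (one vector).

    The series length is assumed to be a power of two, as in the n = 8 example.
    """
    layers = [list(series)]
    while len(layers[-1]) > 1:
        low = layers[-1]
        layers.append([TABLE[(low[i], low[i + 1])] for i in range(0, len(low), 2)])
    return layers

# Hypothetical eight-direction vector series.
pyramid = build_pyramid([1, 2, 4, 5, 5, 4, 2, 1])
```

With these assumptions each of the 64 possible (pre, post) pairs maps to exactly one upper-layer vector, so the table is complete and unambiguous. The hypothetical series above is reduced to [1, 4, 4, 1], then [2, 2], then [2].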

(F3) Classification tree generating step 

In the pre-layering process (F2), all the segments of the training stroke (TPij) are developed into vectors from bot- 
tom to top pyramidally as shown in Fig. 5. To generate the classification tree, the vectors are processed in the opposite 
direction, namely, from top to bottom. 

In the following description, it will be assumed that the base vectors have eight directions, or there are eight vectors 
numbered 0 through 7 shown in Fig. 6. In this case, all the vectors in the vector pyramid will be covered by these base 
vectors. 

The topmost layer includes eight vectors; therefore, eight branches will extend from the root node as shown in Fig. 7.

At this time, the number of the training strokes (TPij) which exist in the branches is counted. Depending on the
counting result, one of the following three types of processing will be implemented:

1. If no training stroke (TPij) exists in a branch, then that particular branch is removed. 

2. If the strokes of only a certain category out of the training strokes (TPij) exist (e.g. only the strokes of "つ"
exist), then that particular branch is turned into a leaf and assigned the category number (e.g. "つ").

3. In cases other than those described in 1 and 2 above, that is, if strokes of a plurality of categories are mixed,
then that particular branch is turned into a node to continue the generation of the classification tree.
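
The three cases can be expressed as a small Python function. This is a sketch only; `branch_action` is a hypothetical name, and a branch is represented simply by the list of category labels of the training strokes that fall into it.

```python
def branch_action(category_labels):
    """Decide what happens to a branch from the categories of its training strokes.

    Returns ("remove", None) when the branch is empty, ("leaf", category) when a
    single category is present, and ("node", None) when categories are mixed.
    """
    kinds = set(category_labels)
    if not kinds:
        return ("remove", None)      # case 1: no training stroke in the branch
    if len(kinds) == 1:
        return ("leaf", kinds.pop())  # case 2: only one category remains
    return ("node", None)            # case 3: categories are still mixed

results = [branch_action([]),
           branch_action(["tsu", "tsu"]),
           branch_action(["ku", "shi", "tsu"])]
```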

Fig. 7 shows the processing result. In Fig. 7, the branches are indicated by the vectors in the topmost layer
(hereinafter referred to as "the first layer") shown in Fig. 5.

The branches with "X" indicated in the column showing the types of categories correspond to the case where no 
training stroke (TPij) exists, and therefore they are eliminated. 

The third branch from the left has the training strokes of only one category. This corresponds to the case where the
strokes of only one particular category exist, so that the branch is turned into a leaf.

For instance, the fourth and fifth branches from the left have the training strokes of the categories

"く", "し", and "つ";

they correspond to the case other than the cases 1 and 2, namely, the strokes of a plurality of categories are mixed.
Thus, these branches provide nodes.

The following will describe how to generate branches from the nodes. 

The most efficient method for generating branches from the nodes will be described. The most efficient method
should enable as much information as possible on categories to be obtained when branches are developed.
The following will describe the method for selecting a vector that permits highest efficiency when the branches are
developed.

The number of the training strokes of category No. i among the training strokes (TPij) which exist in a certain node
is denoted as Ni. When the total number of the training strokes existing in the node is denoted as N, then the existence
probability pi of each category in the node can be expressed as follows:

$p_i = N_i / N$

If the number of the types of categories in a certain node is 2, for example, then:

$N = \sum_{i=0}^{2} N_i$

Therefore, the entropy at the time when the information on the node is obtained will be represented by the following
expression:

$\mathrm{Entropy}_{node} = -\sum_{i=0}^{2} p_i \log(p_i) = -\sum_{i=0}^{2} \frac{N_i}{N} \log\left(\frac{N_i}{N}\right) = \frac{1}{N} \sum_{i=0}^{2} N_i (\log N - \log N_i)$    Expression (1)
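
As a worked check of Expression (1), the following Python sketch computes the node entropy both from the probabilities pi and directly from the counts Ni. The function names are hypothetical, the natural logarithm is assumed, and a category with no strokes is taken to contribute nothing to the sum.

```python
import math

def node_entropy(counts):
    """Entropy of a node from the existence probabilities p_i = N_i / N."""
    n = sum(counts)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)

def node_entropy_from_counts(counts):
    """The same quantity written as (1/N) * sum of N_i * (log N - log N_i)."""
    n = sum(counts)
    return sum(c * (math.log(n) - math.log(c)) for c in counts if c > 0) / n

# One hundred training strokes for each of the three categories.
balanced = node_entropy([100, 100, 100])
# A node that holds strokes of a single category carries no uncertainty.
pure = node_entropy([100, 0, 0])
```

For three equally represented categories the entropy equals log 3, and for a node holding a single category it is zero, which is the condition under which the node becomes a leaf.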

Then, a certain vector is selected in this node and the decrement of the entropy when a branch is developed is
calculated.

As described above, the number of the branches developed from the single vector toward the lower layers is eight.
The distribution of the training strokes (TPij) among the eight branches is indicated by the number of the training strokes
(TPij) which exist in the developed branches, i.e.:

$N_{i,b}$

where i of Ni,b denotes a category number and b denotes the branch number.
At this time, the entropy at which the information on each branch is obtained is represented by the following
expression, as is the case with the foregoing discussion:

$\mathrm{Entropy}_{b} = -\sum_{i=0}^{2} p_i \log(p_i) = -\sum_{i=0}^{2} \frac{N_{i,b}}{N_b} \log\left(\frac{N_{i,b}}{N_b}\right)$    Expression (2)

In this expression,

$N_b = \sum_{i=0}^{2} N_{i,b}$

indicates the total number of the training strokes (TPij) which exist in the branches.

The probability of distribution into each branch is expressed by:

$N_b / N$

where N is identical to N in the expression (1). Hence, the average entropy at the time when the branches are
developed is represented by the following expression:

$\mathrm{Entropy}_{branch} = \frac{1}{N} \sum_{b=0}^{7} \sum_{i=0}^{2} N_{i,b} (\log N_b - \log N_{i,b})$    Expression (3)



The average decrement of the entropy is obtained by: 

$\mathrm{EntropyDecrease} = \mathrm{Entropy}_{node} - \mathrm{Entropy}_{branch}$    Expression (4)

A value obtained by dividing the value of K by the logarithm of the number of the branches as shown below repre- 
sents the classification efficiency when the branches are developed: 

$\frac{\mathrm{EntropyDecrease}}{\mathrm{BranchNumber}}$    Expression (5)

A vector which gives this value a maximum value is selected to develop the branches. 
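
The selection of the vector to be developed can be sketched in Python as follows. This is an illustration, not the applicant's implementation. Each candidate vector is described by a table `branch_counts[b][i]` holding the number of training strokes of category i that fall into branch b. The denominator used here is the number of branches, as printed in Expression (5); if the logarithm of the number of branches is intended, as the preceding sentence suggests, only the last line of `classification_efficiency` changes. All names are hypothetical.

```python
import math

def entropy(counts):
    """Entropy of a list of category counts (natural logarithm)."""
    n = sum(counts)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)

def classification_efficiency(branch_counts):
    """Entropy decrease divided by the number of branches."""
    n_categories = len(branch_counts[0])
    node_counts = [sum(b[i] for b in branch_counts) for i in range(n_categories)]
    n = sum(node_counts)
    # Average entropy over the branches, weighted by N_b / N; empty branches are skipped.
    entropy_branch = sum(sum(b) / n * entropy(b) for b in branch_counts if sum(b) > 0)
    decrease = entropy(node_counts) - entropy_branch
    return decrease / len(branch_counts)

def select_vector(candidates):
    """Return the name of the candidate vector with the largest efficiency."""
    return max(candidates, key=lambda name: classification_efficiency(candidates[name]))

empty = [[0, 0]] * 6
candidates = {
    "A": [[10, 0], [0, 10]] + empty,  # separates the two categories completely
    "B": [[5, 5], [5, 5]] + empty,    # leaves both branches fully mixed
}
chosen = select_vector(candidates)
```

In this hypothetical example, developing vector "A" removes all uncertainty while developing vector "B" removes none, so "A" is selected.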

The branches may be developed in relation to a group of a plurality of vectors rather than developing only one
vector. In this case, BranchNumber in the expression (5) will be:

(Number of selected vectors) x 8

In this embodiment, the value obtained in the expression (5) is adopted as the value which indicates the
classification efficiency when the branches are developed; however, it is obvious that the value is not limited to the one obtained
by the expression (5) as long as it is a function representing the development efficiency of branches, such as the
"Gini criterion" described in the literature titled "Classification and Regression Trees".
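
For comparison, the Gini criterion can be sketched in Python as below. The text above only names the criterion, so this sketch assumes its usual form, one minus the sum of the squared category probabilities; `gini_impurity` is a hypothetical name.

```python
def gini_impurity(counts):
    """Gini impurity of a node: 1 minus the sum of squared category probabilities."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

mixed = gini_impurity([10, 10])  # two categories in equal numbers
pure = gini_impurity([20, 0])    # a single category
```

Like the entropy, this value is zero for a node holding a single category and grows as the categories become more evenly mixed, so it could replace the entropy when measuring the decrease obtained by developing a vector.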

Thus, once a vector or a set of vectors to be developed is decided, the branches are developed and leaves and
nodes are generated accordingly. Lastly, when all vectors have been turned into leaves, the classification tree is
completed.

The processing described above is illustrated in the form of a flowchart in Fig. 10. The procedure for generating the
classification tree as shown in Fig. 8 will now be described.

Firstly, in a step S1000, a noticed node is set as a root node as shown in Fig. 8.

In a step S1001, the set noticed node is checked for the three conditions set forth below:

1. No training stroke exists.

2. Training patterns of only one category exist.

3. Training patterns of a plurality of categories exist.

If the condition 1 is satisfied, then the program proceeds to a step S1002. If the condition 2 is satisfied, then
the program proceeds to a step S1005. If the condition 3 is satisfied, then the program proceeds to a step S1006.
In the step S1002, the node is deleted from the classification tree.

In a step S1003, all other nodes are checked to determine whether they have turned into leaf nodes. If the checking
result is YES, then the program terminates the processing; if the checking result is NO, then the program proceeds to a
step S1004 where it selects another node as the noticed node. Then, the program goes back to the step S1001 to repeat
the same processing.

In the step S1005, the node is assigned the category number as a leaf node. The program then proceeds to the
step S1003.

In the step S1006, one vector is selected from a vector string included in the node according to the aforesaid
entropy standard.

In a step S1007, the branch of a pair of vectors of a layer under the selected vector is generated.
Fig. 11 illustrates the processing implemented in this step; it shows the examples of the pairs of vectors in the lower
layer.

Referring to Fig. 11, it is assumed that 5000 denotes a vector which has been selected in the step S1006 and which
has a direction "2". There are eight different pairs of vectors in a lower layer, namely, 5001, 5002, 5003, 5004, 5005,
5006, 5007, and 5008, that are matched to the vector 5000. Branches which take these pairs of vectors as new nodes
are generated.
The above has described a specific example of processing carried out in the step S1 007, 

In the following step, the program goes to a step S1008 where it sets one of the nodes of the generated branches 
30 as the next noticed node, then it goes back to the step S1 001 to repeat the same processing. 
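
The loop of Fig. 10 can be sketched in Python as a worklist over noticed nodes. This is only an illustrative sketch under stated assumptions, not the procedure of the embodiment itself: each training pattern is assumed to be a pair (category, codes) with codes a fixed-length tuple of quantized values, the function name generate_tree and the dictionary layout are hypothetical, and splitting on the position given by the node depth is a placeholder for the entropy-based selection of the step S1006.

```python
from collections import defaultdict


def generate_tree(patterns):
    """Build a classification tree from (category, codes) training patterns.

    Assumption: `codes` is a tuple of quantized values, all of equal length.
    Splitting on position `depth` stands in for the entropy-based choice (S1006).
    """
    if not patterns:
        return None  # nothing to classify
    root = {"patterns": list(patterns), "depth": 0, "children": {}, "category": None}
    pending = [root]  # S1000: the root is the first noticed node
    while pending:  # S1003/S1004: stop when no undeveloped node remains
        node = pending.pop()
        members, depth = node["patterns"], node["depth"]
        categories = sorted({category for category, _ in members})
        if len(categories) == 1:
            node["category"] = categories[0]  # S1005: leaf holding one category
            continue
        if depth >= len(members[0][1]):
            # Assumption: when nothing is left to develop, keep an ambiguous leaf.
            node["category"] = tuple(categories)
            continue
        groups = defaultdict(list)  # S1007: one branch per observed value
        for category, codes in members:
            groups[codes[depth]].append((category, codes))
        # Values with no training pattern never get a branch, which plays the
        # role of the deletion in S1002.
        for value, group in groups.items():
            child = {"patterns": group, "depth": depth + 1,
                     "children": {}, "category": None}
            node["children"][value] = child
            pending.append(child)  # S1008: becomes a noticed node later
    return root
```

Calling generate_tree with a handful of patterns returns nested dictionaries in which every terminal node carries a category, mirroring the condition that generation ends when all branch ends are leaves.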

Generating the classification tree as shown in Fig. 8 according to the procedure described above makes it possible
to generate a classification tree which reflects detailed characteristic differences among similar categories while
maintaining general classification of the stroke patterns which have many characteristics. Quick recognition of
characters with a high recognition rate can be achieved by referring to the generated classification tree.

In this embodiment, the method for generating the dictionary for online handwritten character recognition in
accordance with the present invention has been described on the assumption that there is one training stroke. It is
obvious, however, that the same processing according to the embodiment can be applied to process each stroke in a
system which takes more than one stroke for an input character in actual use.

The generation of the classification tree shown in Fig. 8 will be described in further detail.
Fig. 8 is a diagram which adds the lower layers to the layers shown in Fig. 7; it omits the branches which have been
deleted. The branches enclosed in boxes (□) shown in Fig. 8 indicate that they are leaves.

All branches other than the leaves will be the nodes; therefore, further branch development will be implemented. 
Fig. 8 shows the result of the further branch development related to a second node (201) from the left. 

In the second node (201) from the left, three types of categories, namely,

"く", "し", and "つ"

coexist, requiring the development of branches.

There is only one vector (the circled vector) that represents the state of the node; therefore, the vector to be
developed is uniquely decided. The pair of vectors to be developed is based on the pairs of vectors shown in Fig. 6.
Specifically, a branch is developed to correspond to the eight combinations of the two vectors in the lower layer, the
vector of an upper layer of which can be the vector in the bottom right direction (No. 3). This state means that the
vectors have been developed to the second layer in Fig. 5.

Further, a node (202) of a second branch from the left in development includes two categories, namely,

"し" and "く".

Thus, further branch development is necessary. It is assumed that the first vector of the two vectors representing the
node has been selected to be developed as a result given in the step S107 for discriminating a development vector.
Then, eight branches are developed as is the case with the upper layer in relation to the state of the first vector, and
some branches are deleted, some branches are turned into leaves, and some branches are turned into nodes. The
branches which have turned into nodes must be further developed until the ends of all branches are eventually turned
into leaves.

Fig. 12 primarily shows the configuration inside a memory in an information processing unit to which the online
handwritten character recognizing method in accordance with the embodiment is applied. A CPU 1201 corresponds to
the CPU denoted by 403 in Fig. 1; it executes various types of processing described in this embodiment in accordance
with control programs stored in a memory 1202 which will be discussed later. The control program for implementing the
processing illustrated by a flowchart which will be described later is also stored in the memory 1202 and executed by
the CPU 1201.

The memory 1202 has a program section 1202-1 for storing the control programs for the CPU 1201 to execute
various types of processing and a data section 1202-2 for storing various parameters and data. The program section
stores, for example, the individual parts of the flowchart shown in Fig. 10 as subroutine programs 1202-1-1 through
1202-1-3. The subroutine programs include the processing program used in S1001 for discriminating the state of a
noticed node, the processing program used in S1002 for deleting a node, the processing program used in S1005 for a
leaf node, the processing program used in S1006 for selecting a proper vector, the processing program used in S1007
for generating a branch of pairs of vectors, and the program for recognizing an input pattern by referring to a generated
classification tree; these subroutine programs for the respective types of processing are stored in the program section
1202-1. When executing each processing which will be discussed later, a control program is read from the memory
1202 as necessary for the CPU 1201 to execute the processing. The data section 1202-2 has a training pattern buffer
1202-2-1 for tentatively holding individual training patterns, an area 1202-2-2 for holding pyramidally developed
patterns of vector data obtained from respective training patterns, and a classification tree buffer 1202-2-3 for holding a
classification tree which is being generated.

A hard disk drive (HDD) 1203 holds all training patterns and also holds the data on a classification tree generated 
by the method described in this embodiment. 

The memory 1202 may be a built-in ROM, RAM, HD, or the like. The programs and data may be stored beforehand
in the memory, or the programs or data may be read prior to processing from a storage medium such as a floppy disk
(FD) or CD-ROM which may be removed from the main body of the apparatus. As another alternative, such programs
or data may be read from another apparatus via a public line, LAN, or other communication means.

(Character recognizing method based on a generated classification tree) 

In a second embodiment, a description will be given to a method for online handwritten character recognition by
referring to a classification tree generated by the processing procedure which has been described in the foregoing
embodiment.

Fig. 9 shows a flowchart which provides a best illustration of the processing procedure.

In Fig. 9, reference character 301 denotes the data of a handwritten stroke entered by a user. The handwritten
stroke is identical to the training stroke 101 shown in the first embodiment.

A step S302 is the step for dividing the handwritten stroke into stroke segments. 

A step S303 is the step for making the stroke segments vectors, wherein the stroke segments resulting from the 
process in the preceding step are turned into vectors. 

A step S304 is the step for pre-layering vector series obtained in the preceding step for making the stroke segments
vectors.

Reference character 305 denotes a layered vector series which has undergone the process of the pre-layering step.

A step S307 is a category discriminating step for determining the category of the handwritten stroke 301 according
to the layered vector series 305 by referring to the classification data given by a classification tree 306.

The classification tree 306 is a classification tree which provides the information necessary for classifying
categories; it should be the classification tree which can be generated using the method described in the first embodiment.

The same three types of processing used in the step S102 for dividing a stroke into stroke segments, the step S103
for making stroke segments vectors, and the step S104 for pre-layering are used for the foregoing step S302 for dividing
a stroke into stroke segments, the step S303 for making stroke segments vectors, and the step S304 for pre-layering,
respectively.

There were as many layered vector series 305 as the training patterns in the first embodiment, while there is only
one that is derived from the handwritten stroke in this embodiment.

In the category discriminating step S307, when a leaf is reached after tracing the layered vector series 305 accord- 
ing to the classification tree shown in Fig. 8, the category existing in the leaf is output as a recognition result. 
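
The tracing performed in the category discriminating step S307 can be sketched as a walk from the root to a leaf. The sketch below is an assumption-laden illustration: the tree is a hypothetical nested dictionary in which an internal node records the position it developed under the key "feature" and its branches under "children", a leaf holds "category", and the input is a tuple of quantized codes. Returning None when no branch matches is also an assumption, since the text does not state how such an input is handled.

```python
def classify(tree, codes):
    """Trace `codes` down the tree and return the category found at a leaf.

    Returns None if the input follows a branch that was never generated
    (an assumption; the source does not describe this case).
    """
    node = tree
    while "category" not in node:
        # Follow the branch matching the value at the position this node developed.
        node = node["children"].get(codes[node["feature"]])
        if node is None:
            return None
    return node["category"]


# Hypothetical hand-built tree for the three one-stroke categories.
example_tree = {
    "feature": 0,
    "children": {
        0: {"category": "tsu"},
        2: {
            "feature": 1,
            "children": {1: {"category": "ku"}, 3: {"category": "shi"}},
        },
    },
}
```

Because only one branch is followed per layer, the number of comparisons grows with the depth of the tree rather than with the number of categories, which is consistent with the quick recognition claimed for the method.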

Fig. 13 primarily shows the configuration inside a memory in an information processing unit to which the online
handwritten character recognizing method in accordance with the embodiment is applied. A CPU 1301 corresponds to
the CPU denoted by 403 in Fig. 1; it executes various types of processing described in this embodiment in accordance
with control programs stored in a memory 1302 which will be discussed later. The control program for implementing the
processing illustrated by a flowchart which will be described later is also stored in the memory 1302 and executed by
the CPU 1301.

The memory 1302 has a program section 1302-1 for storing the control programs for the CPU 1301 to execute
various types of processing and a data section 1302-2 for storing various parameters and data. The program section
stores, for example, the individual parts of the flowchart shown in Fig. 9 as subroutine programs. The subroutine
programs include the processing program used in S302 for dividing a stroke into stroke segments, the processing program
used in S303 for making stroke segments vectors, the processing program used in S304 for pre-layering, and the
processing program used in S307 for discriminating a category; these subroutine programs for the respective types of
processing are stored in the program section 1302-1. When executing each processing which will be discussed later, a
control program is read from the memory 1302 as necessary for the CPU 1301 to execute the processing. The data
section 1302-2 has a buffer for holding patterns entered by the user, an area for holding a pyramidally developed pattern
of vector data obtained from the entered pattern, and a buffer for holding a recognition candidate of the input pattern.
A hard disk drive (HDD) 1303 holds the data on a classification tree generated by the method described in the
preceding embodiment.

The memory 1302 may be a built-in ROM, RAM, HD, or the like. The programs and data may be stored beforehand
in the memory, or the programs or data may be read prior to processing from a storage medium such as FD or CD-ROM
which can be removed from the main body of the apparatus. As another alternative, such programs or data may be read
from another apparatus via a public line, LAN, or other communication means. Thus, according to this embodiment,
extremely quick online handwritten character recognition can be achieved with a high recognition rate by employing the
generated stroke vector classification tree which successfully reflects the competitive relationship among categories.

(Second Embodiment)

In a second embodiment, an example will be described in which an N-gram table is generated according to a clas- 
sification tree which has been generated by layering training patterns. 

In the layering process of training patterns in accordance with this embodiment, the characteristics of the training
pattern will be degenerated in steps.

In generating the classification tree according to this embodiment, a variable is selected which ensures the sever- 
est competition among categories in an upper layer and good separation of the categories in a layer immediately there- 
under, and the foregoing degenerated variable is developed toward lower layers. 

The training stroke in this embodiment is divided and the stroke segments resulting from the division are turned into
vectors, and the resulting vector series are pyramidally layered to make layered vector series. The layered vector series
are used to generate a classification tree, and the N-gram table is generated according to the generated classification
tree.

In the pre-layering process according to the second embodiment, the vector series information constituting the
training stroke is degenerated in steps.
In the classification tree generating process according to this embodiment, a vector is selected which ensures the
severest competition among categories in an upper layer and good separation of the categories in a layer immediately
thereunder. Based on the result, the degenerated vector is developed toward the lower layers.

Further, in the embodiment, a sentence entered by the user is recognized by referring to the generated N-gram 
table. 

In conjunction with the accompanying drawings, an information processing apparatus according to this
embodiment of the present invention will now be described in detail.

(In the case of an image)

Fig. 14 is a block diagram showing the configuration of an information processing apparatus to which the pattern
recognition system involved in the following entire embodiment of the present invention will be applied.

A pattern recognition apparatus is comprised of a pattern input device 1401, a display 1402, a central processing
unit (CPU) 1403, and a memory 1404.

The pattern input device 1401 has, for example, a digitizer and a pen if it is adapted for online character recognition;
it hands the coordinate data of a character or graphic drawn using the pen on the digitizer over to the CPU 1403. The
pattern input device may be a scanner, microphone, etc. as long as it enables the input of a pattern, which is to be
recognized, as an image.

The display 1402 displays the pattern data entered in the pattern input device 1401 and a result of the recognition
by the CPU 1403.

The CPU 1403 recognizes an entered pattern and also controls all the devices involved.

The memory 1404 stores a recognition program or a dictionary employed by the CPU 1403 and also tentatively
stores entered pattern data, variables used by the recognition program, etc.

Fig. 15 is a conceptual diagram illustrating the information processing procedure of the embodiment in accordance
with the present invention. Reference character 1501 denotes training patterns, and S1502 denotes a pre-layering step
for applying the training patterns 1501 to a neural network. Reference character 1503 indicates layered training patterns
which have undergone the processing by the neural network; and S1504 indicates a step for generating a classification
tree according to the layered training patterns 1503.

Reference character S1505 denotes a step for discriminating development variables used in the process of
generating a classification tree in the classification tree generating step S1504.

Reference character 1506 indicates a classification tree generated by the processing implemented in the step
S1504.

Reference character 1507 indicates a sentence database; the sentence database includes a variety of sentence
patterns generally used. The sentence database is accessed in an N-gram generating step, which will be discussed
later, for determining a prior probability with a classification tree which has been generated in advance.

Reference character S1508 indicates an N-gram table generating step for generating an N-gram table 1509
according to the sentence database 1507 and the classification tree 1506. The inputs in this embodiment are the training
patterns 1501 and the sentence database 1507, and the output thereof is the N-gram table 1509.

Referring now to Fig. 16 through Fig. 20, the processing procedure in accordance with the embodiment will be 
described in detail. 

Firstly, it is assumed that there are ten numeral patterns from 0 to 9 written as input patterns on a 16 x 16 mesh. An
example of the input pattern of 0 is shown in Fig. 18.

There are 100 training patterns each for 0 to 9 for generating a dictionary. These are named as:

LTi,j (Learning Template i, j)

where i denotes a suffix representing a category and it takes a value in the following range:
0 ≤ i ≤ 9

where j denotes a suffix representing a training pattern number and it takes a value in the following range:
1 ≤ j ≤ 100

A four-layer neural network as shown in Fig. 16 is configured.

The four layers shown in Fig. 16 are respectively composed of groups of neurons of 2 x 2, 4 x 4, 8 x 8, and 16 x 16
pieces from the top layer to the bottom layer.

The method for generating a dictionary for pattern recognition is composed of three steps, namely, a neural net
development step, a classification tree generating step, and an N-gram table generating step. Each of these steps will
be described in order with reference to Fig. 17.

(F171) Neural net development step 

Firstly, the training template is input to the bottommost layer of 16 x 16 neurons shown in Fig. 16. At this time, it is
assumed that the neurons in the white portion of the input pattern (LTi,j) are OFF, while the neurons in the black portion
are ON. Hereafter, "black" will mean that the neurons are ON, and "white" will mean that the neurons are OFF.

The configuration of the neural net is extremely simple; if any one neuron that is ON exists in the 2 x 2 neurons of 
a lower layer, then one neuron of the layer immediately above the layer should be ON. This rule applies in processing 
the input pattern upward. 
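
The upward propagation rule amounts to a logical OR over each 2 x 2 block. A minimal Python sketch follows, assuming the input pattern is given as a square list of lists of 0 (white, OFF) and 1 (black, ON) whose side is a power of two; the function name build_pyramid is hypothetical.

```python
def build_pyramid(bitmap):
    """Return the layers [2x2, 4x4, ..., NxN] obtained by OR-ing 2 x 2 blocks.

    `bitmap` is the bottommost layer, e.g. a 16 x 16 grid of 0/1 values.
    """
    layers = [bitmap]
    current = bitmap
    while len(current) > 2:
        half = len(current) // 2
        # An upper neuron is ON if any of the four neurons below it is ON.
        upper = [
            [
                int(any((current[2 * r][2 * c], current[2 * r][2 * c + 1],
                         current[2 * r + 1][2 * c], current[2 * r + 1][2 * c + 1])))
                for c in range(half)
            ]
            for r in range(half)
        ]
        layers.append(upper)
        current = upper
    layers.reverse()  # topmost (2 x 2) layer first, input layer last
    return layers
```

For a 16 x 16 input the function returns four layers of sizes 2 x 2, 4 x 4, 8 x 8, and 16 x 16, in the top-to-bottom order used when the classification tree is generated.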

Fig. 19 shows a result of the processing carried on the training template shown in Fig. 18. 

Eventually, the characteristic space of the input pattern forms a 256-dimensional hypercubic lattice which has 2^256
different combinations of data.

The number of combinations of data will be 2^4 in a first layer, 2^16 in a second layer, and 2^64 in a third layer.

The configuration of the neural net is not limited thereto. 

(F172) Classification tree generating step 

In the neural net development step F171, all training templates (LTi,j) are developed to the neural net shown in Fig.
16. The classification tree is generated from top to bottom, which is opposite from the case of the neural net
development.

The root node corresponds to a neuron which virtually exists above the topmost layer (2 x 2) shown in Fig. 16.
As a result of developing the training templates (LTi,j), some neurons of the topmost layer (2 x 2) shown in Fig. 16
are ON. In other words, the topmost layer (2 x 2) is not completely turned OFF unless a completely white training
template exists.

Thus, the neurons of the topmost layer which virtually exists are ON relative to all the training templates (LTi,j).

There are 2^4 = 16 states of the topmost layer (2 x 2); therefore, 16 branches extend from a root node as shown in
Fig. 20. To be more accurate, there are 15 states since not all neurons are OFF as described above.
At this time, the number of the training templates (LTi,j) which are present in the branches is counted. Depending
on the counting result, one of the following three types of processing will be implemented:

(1) If no training template (LTi,j) exists in a branch, then that particular branch is removed.

(2) If the templates of only a certain category (e.g. "1") out of the training templates (LTi,j) exist, then that particular
branch is set as a leaf and assigned the category number (e.g. "1").

(3) In cases other than (1) and (2) above, that is, if templates of a plurality of categories are
mixed, then that particular branch is set as a node to continue the generation of the classification tree.
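
The three cases can be sketched for the root as follows. This is an illustration only: each training template is assumed to have been reduced to a pair (category, state), where state is a tuple of the four ON/OFF values of the topmost 2 x 2 layer, and the function name develop_root and the labels "removed", "leaf", and "node" are hypothetical.

```python
from collections import defaultdict
from itertools import product


def develop_root(templates):
    """Map each of the 16 topmost-layer states to its branch type.

    `templates` is a list of (category, state) pairs, with `state` a tuple of
    four 0/1 values. Returns {state: (kind, categories)}.
    """
    members = defaultdict(set)
    for category, state in templates:
        members[state].add(category)

    branches = {}
    for state in product((0, 1), repeat=4):
        categories = tuple(sorted(members.get(state, ())))
        if not categories:
            branches[state] = ("removed", categories)  # case (1)
        elif len(categories) == 1:
            branches[state] = ("leaf", categories)      # case (2)
        else:
            branches[state] = ("node", categories)      # case (3)
    return branches
```

Only the branches labeled "node" need further development, which is where the choice of the neuron to develop, discussed next, comes in.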

Fig. 20 shows the processing result.
The branch states are indicated by showing the ON/OFF of the neurons of the topmost (first) layer of Fig. 16.
Specifically, the black portion indicates the neurons that are ON, while the white portion indicates the neurons that are OFF.

The branches with "X" indicated in the column showing the types of categories correspond to the case (1) where
no training templates (LTi,j) exist, and therefore they are eliminated.
Strictly speaking, the leftmost branch does not extend from the root.
The eighth branch from the left has the training templates of only the category 1. This corresponds to the case (2)
where the templates of only one particular category (e.g. "1") of the training templates (LTi,j) exist, so that the branch is
turned into a leaf.

For instance, the twelfth branch from the left has the training templates of the categories 2, 4, 5, and 6; this
corresponds to the case (3) rather than the case (1) or (2), namely, the templates of a plurality of categories are mixed. Thus,
this branch provides a node.

The following will describe how to generate branches from the node. 

The most efficient method for generating branches from the node will be described. The most efficient method
should enable as much information as possible on categories to be obtained when branches are developed.

Generally, there are so many ways to develop the branches under such conditions that it is difficult to decide which
one to adopt. This has been hitherto an obstacle to successful generation of a classification tree used for recognition.

An attempt will be made to limit the branches to be developed from the node to only one branch wherein the
neurons that are ON are developed to lower layers at this node.

For instance, in the case of the twelfth branch from the left shown in Fig. 20, one of the three neurons, namely, the
top left, bottom left, and bottom right neurons of the first layer shown in Fig. 16, is selected, and the branch related to
the states of the neurons under the selected neuron, i.e. the states of the bottom four neurons of the second layer of
Fig. 16, is developed. This significantly reduces the time for the calculation required to develop the branch. In
addition, such limitation does no serious damage to the classifying performance of the classification tree to be
generated.

A description will now be given to a method for selecting, among the neurons that are ON at the node, the neuron
which enables the highest efficiency in the development.

The number of the training templates of category No. i among the training templates (LTi,j) which exist in a certain
node is denoted as Ni. When the total number of the training templates existing in the node is denoted as N, then the
existence probability pi of each category in the node can be expressed as follows:

$$p_i = \frac{N_i}{N}$$

where

$$N = \sum_{i=0}^{9} N_i$$

Therefore, the entropy at the time when the information on the node is obtained will be represented by the following
expression:

$$\mathrm{Entropy}_{node} = -\sum_{i=0}^{9} p_i \log p_i = -\sum_{i=0}^{9} \frac{N_i}{N} \log \frac{N_i}{N} = \frac{1}{N} \sum_{i=0}^{9} N_i \left( \log N - \log N_i \right) \qquad \text{Expression (6)}$$

Then, one of the neurons which are ON at this node is selected and the decrement of the entropy when a branch is
developed therefrom is calculated.

As described above, the number of the branches developed from the single neuron toward lower layers is sixteen.
The distribution of the training templates (LTi,j) among the sixteen branches is indicated by the number of the training
templates (LTi,j) which exist in the developed branches, i.e.:

Ni,b

where i of Ni,b denotes a category number and b denotes the branch number.
At this time, the entropy at which the information on each branch is obtained is represented by the following
expression as is the case with the foregoing discussion:

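$$\mathrm{Entropy}_{branch} = -\sum_{i=0}^{9} \frac{N_{i,b}}{N_b} \log \frac{N_{i,b}}{N_b} = \frac{1}{N_b} \sum_{i=0}^{9} N_{i,b} \left( \log N_b - \log N_{i,b} \right) \qquad \text{Expression (7)}$$

In this expression,

$$N_b = \sum_{i=0}^{9} N_{i,b}$$

indicates the total number of the training templates (LTi,j) which exist in the branch b.
The probability of distribution into each branch is expressed by:

$$\frac{N_b}{N}$$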

where N is identical to N in the expression (6), and therefore, the average entropy at the time when the branches
are developed is represented by the following expression:

$$\overline{\mathrm{Entropy}}_{branch} = \frac{1}{N} \sum_{b=1}^{16} \sum_{i=0}^{9} N_{i,b} \left( \log N_b - \log N_{i,b} \right) \qquad \text{Expression (8)}$$

The average decrement of the entropy is obtained by:

$$\mathrm{EntropyDecrease} = \mathrm{Entropy}_{node} - \overline{\mathrm{Entropy}}_{branch} \qquad \text{Expression (9)}$$

A value obtained by dividing this value by the number of the branches as shown below represents the classification
efficiency when the branches are developed:

$$\frac{\mathrm{EntropyDecrease}}{\mathrm{BranchNumber}} \qquad \text{Expression (10)}$$

The neuron which maximizes this value is selected to develop the branches.
The branches may be developed in relation to a group of a plurality of neurons rather than developing only one
neuron.

In this case, BranchNumber in the expression (10) will be obtained by multiplying the number of neurons by 16.
Technically, it is impossible to expect a state where all neurons of the lower layers involved in the development are OFF;
therefore, BranchNumber will be the number of neurons multiplied by 15.
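
The selection criterion of the expressions (6) through (10) can be sketched numerically. The Python below is a sketch under assumptions: the counts Ni,b are supplied as one list per branch, the natural logarithm is used (the text does not state the base), terms with a zero count are taken to contribute nothing, and the function names and the branch_number parameter are hypothetical.

```python
import math


def entropy_from_counts(counts):
    """Entropy of a count vector, in the form (1/N) * sum N_i (log N - log N_i)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Zero counts are skipped: they contribute nothing to the sum.
    return sum(c * (math.log(total) - math.log(c)) for c in counts if c > 0) / total


def classification_efficiency(branch_counts, branch_number=None):
    """Entropy decrease per branch for one candidate neuron.

    `branch_counts[b][i]` is N_{i,b}: templates of category i in branch b.
    `branch_number` defaults to the number of branches supplied.
    """
    num_categories = len(branch_counts[0])
    node_counts = [sum(branch[i] for branch in branch_counts)
                   for i in range(num_categories)]
    total = sum(node_counts)
    # Average branch entropy, weighted by the share N_b / N of each branch.
    average_branch = sum(sum(branch) / total * entropy_from_counts(branch)
                         for branch in branch_counts)
    decrease = entropy_from_counts(node_counts) - average_branch
    if branch_number is None:
        branch_number = len(branch_counts)
    return decrease / branch_number
```

Evaluating classification_efficiency once per ON neuron and keeping the neuron with the largest value corresponds to the selection described above; passing branch_number=15 or 16 reproduces the two conventions mentioned in the text.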
In this embodiment, the value obtained in the expression (10) is adopted as the value which indicates the
classification efficiency when the branches are developed; however, it is obvious that the value is not limited to the one
obtained by the expression (10) as long as it is a function representing the development efficiency of branches, such as
the "Gini criterion" described in the literature titled "Classification and Regression Trees".

Thus, once a neuron or a set of neurons to be developed is decided, the branches are developed and leaves and
nodes are generated accordingly.

Lastly, when all neurons have been turned into leaves, the classification tree is completed.
Fig. 8 shows the contents of the classification tree which has actually been generated.
Fig. 8 gives more details of Fig. 7; it omits the deleted branches. The circled branches in Fig. 8 indicate that they
are leaves.

All branches other than the leaves will turn into nodes; therefore, further branch development will be implemented.
Fig. 21 shows the result of the further branch development related only to the third node from the right.

In the third node from the right, three types of categories, namely, "1", "7", and "9" coexist, requiring the
development of branches. It is assumed that the top right neuron of the first layer has been selected to be developed in the first
layer as a result given in the step S1505 (Fig. 15) for discriminating a development variable.

Then, 2^4 = 16 branches are developed as is the case shown in Fig. 20 in relation to the state of the top right neuron,
and some branches are deleted, some branches are turned into leaves, and some branches are turned into nodes.

The branches which have turned into nodes must be further developed until the ends of all branches are eventually
turned into leaves.

In Fig. 21, for the purpose of clarity, the first layer and the second layer are superimposed to show the development
result of the third node from the right. Actually, these states are represented by the four neurons of the first layer and
the four top right neurons of the second layer of the neural net illustrated in Fig. 16.

(F173) N-gram table generating step

As illustrated in Fig. 21, the first layer of the classification tree obtained as a result of the classification tree
generating step (F172) is equivalent to the general classification of all categories to be recognized, the general classification
being based on the shapes thereof.

Hence, generating an N-gram according to the category groups in the general classification should provide highly
reliable state transition probability from a smaller database. In this case, the category groups in the general classification
are regarded as virtual category groups.

It should be noted, however, that the first layer of the classification tree obtained as a result of the classification tree
generating step is not always exclusive.

For example, in Fig. 21, the category 1 exists in four branches or nodes. This phenomenon is generally known as
"overlap classes" which is referred to, for instance, in a paper titled "A Survey of Decision Tree Classifier
Methodology" (IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, No. 3, May/June 1991).

There is a method for making the "overlap classes" exclusive: the branch which has the highest probability of the
presence of a certain category is set as the branch dedicated to that particular category. Referring now to Fig. 21, if the
probability of the category 1 being present is the highest in the second branch from the left, then the category 1 existing
in the first, third, and sixth branches from the left is ignored.

An example of the category groups of the general classification thus generated is shown in Fig. 22.
In Fig. 22, the categories marked with circled numbers have the highest probabilities of presence.
For example, according to the diagram, counting from the left, the category 1 forms a first category group, categories
4 and 6 form a second category group, categories 7 and 9 form a third category group, and categories 0, 2, 3, 5, and 8
form a fourth category group. As a result, the original ten categories have been reduced to four groups. These four groups
are used as new virtual category groups to generate the N-gram.
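
The grouping and the subsequent table generation can be sketched as follows. Several points here are assumptions rather than statements of the embodiment: the probability of presence of a category in a branch is approximated by the count of its training templates in that branch, ties go to the branch encountered first, N is taken to be 2 (a bigram), and the table holds raw counts rather than normalized probabilities. The function names are hypothetical.

```python
from collections import Counter


def exclusive_groups(branch_counts):
    """Assign each category to the single branch where it is most frequent.

    `branch_counts` maps a branch id to {category: template count}.
    Returns {category: branch id}, i.e. the virtual category group of each category.
    """
    best = {}
    for branch, counts in branch_counts.items():
        for category, count in counts.items():
            if category not in best or count > best[category][1]:
                best[category] = (branch, count)
    return {category: branch for category, (branch, _) in best.items()}


def bigram_table(sentences, group_of):
    """Count transitions between the virtual category groups of adjacent characters."""
    table = Counter()
    for sentence in sentences:
        groups = [group_of[character] for character in sentence]
        table.update(zip(groups, groups[1:]))
    return table
```

Because the table is indexed by a few virtual groups instead of all categories, far fewer sentences are needed to populate each entry, which is the motivation given above for using the general classification.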

The N-gram table thus generated can be incorporated in a general sentence recognition algorithm although it has
been generated using the information of the classification tree which has been obtained by the classification tree
generating step. This means that the N-gram table may be used to determine the prior probability of a sentence and a
completely different recognition algorithm may be used for determining a posterior probability without using the foregoing
classification tree.

Obviously, the N-gram table may be built into a publicly known algorithm such as DP matching or a full search algorithm
for recognizing a sentence by determining the prior probability of the sentence using the N-gram table so as to
determine the prior probabilities of all patterns including all combinations of the pattern shapes constituting the sentence.
In the above description, the first layer of the classification tree has been regarded as the general classification
category group; however, the general classification category group may be comprised of any number of layers up to nth layer.

(In the case of strokes) 

Fig. 23 illustrates the processing procedure according to the second embodiment.

Reference character 2301 denotes a training stroke; in a stroke dividing step S2302, the training stroke is divided 
into a plurality of stroke segments. 

In a step S2303 for making the stroke segments vectors, the stroke segments resulting from the stroke dividing step
S2302 are quantized into vectors.

In a pre-layering step S2304, vector series obtained as a result of the step S2303 for making the stroke segments
vectors are layered to generate layered vector series 2305. This processing will be discussed in detail later.

In a classification tree generating step S2306, a classification tree 2308 is generated according to the layered
vector series 2305.

A step S2307 for discriminating a development vector is implemented in the course of generating the classification
tree in the classification tree generating step S2306.

In an N-gram table generating step S2310, an N-gram table 2311 is generated according to a sentence database
2309 and the classification tree 2308.

The input in the processing flowchart shown in Fig. 23 is a training pattern, i.e. the training stroke 2301, and the
sentence database 2309, and the output is an N-gram table, namely, the N-gram table 2311.

Referring now to Fig. 24, the method for generating the N-gram will be described in detail. 

To make the description easier to follow, three different characters

"く", "し", and "つ"

which read "ku", "shi", and "tsu", respectively, and each of which is drawn in one stroke, will be taken as the
examples representing the categories to be recognized.

It is assumed that there are one hundred training patterns each for

"く", "し", and "つ",

respectively, for generating the dictionary; these are denoted as follows:

TPi,j (Training Pattern i, j)

where i is a suffix denoting the category and it takes a value in the following range:
0 ≤ i ≤ 2

j is a suffix denoting a training pattern number and it takes a value in the following range:
1 ≤ j ≤ 100

As illustrated by the flowchart shown in Fig. 24, the method of generating the dictionary for online handwritten char- 
acter recognition is composed of four steps, namely, a vector generation step, a pre-layering process step, a classifica- 
tion tree generation step, and an N-gram table generating step. The vector generation step, the pre-layering process 
step, and the classification tree generation step are identical to those that have been described in the first embodiment 
by referring to Fig. 3; therefore, only the N-gram table generating step will be described. 

(F24) N-gram table generating step 

An N-gram table will be generated according to the classification tree which has been made as described in (F3) 
Classification tree generating step. 
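
As a rough illustration of generating an N-gram table over groups of categories rather than over individual characters, the following sketch counts bigrams of group labels in a small sentence database. It is only a sketch under stated assumptions: the function name `build_group_bigram_table`, the hand-written `char_to_group` mapping, and the normalization to conditional probabilities are all hypothetical; in the document the groups are obtained from the upper layer of the classification tree.

```python
from collections import Counter

def build_group_bigram_table(sentences, char_to_group):
    """Count bigrams of category groups and normalize them to P(current | previous)."""
    pair_counts = Counter()
    first_counts = Counter()
    for sentence in sentences:
        # Replace every character by the label of its general-classification group.
        groups = [char_to_group[ch] for ch in sentence if ch in char_to_group]
        for prev, curr in zip(groups, groups[1:]):
            pair_counts[(prev, curr)] += 1
            first_counts[prev] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Hypothetical grouping, written out by hand for the example only.
char_to_group = {"く": 0, "し": 1, "つ": 1}
sentences = ["くしつ", "くつし", "しく"]
table = build_group_bigram_table(sentences, char_to_group)
```

Because several characters share one group, each table entry is estimated from more occurrences than a character-level entry would be, which is the effect the document attributes to grouping.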

The unit of each element of the N-gram described above has been a word composed of one character; however, it 
is obvious that the unit may alternatively be a clause composed of a word or the like. 

The present invention may be applied to a system constituted by a plurality of units or to an apparatus constituted 
by a single unit. 

Apparently, an object of the present invention can be accomplished by supplying a storage medium, in which the 
program codes of software for implementing the functions of the foregoing embodiments have been recorded, to the 
system or apparatus, so that a computer, CPU, or MPU of the system or the apparatus can read the program codes 
from the storage medium and execute them. 

In this case, the program codes themselves read from the storage medium would implement the functions of the 
embodiments, and the storage medium storing the program codes would constitute the present invention. 

The storage medium for supplying the program codes may be a floppy disk, hard disk, optical disk, magneto-optical
disk, CD-ROM, CD-RAM, magnetic tape, nonvolatile memory card, ROM, or the like.

Obviously, the present invention also includes a case where executing the program codes which have been read 
by the computer causes the functions of the foregoing embodiments to be implemented and also causes an operating 
system (OS) or the like running on the computer to perform a part or all of actual processing in accordance with the 
instructions of the program codes, thus accomplishing the functions of the foregoing embodiments. 

Furthermore, it is apparent that the present invention also includes a case where the program codes read from a
storage medium are written to a feature expansion board inserted in a computer or a memory provided in a feature
expansion unit connected to a computer, then a CPU provided in the feature expansion board or the feature expansion
unit executes a part or all of actual processing in accordance with the instructions of the program codes, thus
accomplishing the functions of the foregoing embodiments.

When applying the present invention to the foregoing storage medium, the program codes corresponding to the
flowchart which has previously been described are stored in the storage medium. To be brief, the respective modules
shown by an example of a memory map given in Fig. 25 will be stored in the storage medium.

More specifically, the program codes of at least the following modules will be stored in the storage medium: a
pre-layering process module for the step S1502 or S2304; a classification tree generation module for the processing of
the step S1504 or S2306; a development variables discrimination module for the step S1505 or S2307; an N-gram table
generation module for the step S1508 or S2310; a stroke division module for the step S2302; and a vector forming
module for the step S2303.

Thus, according to the present invention, an advantage is provided in which a highly reliable N-gram table can be
generated even with a smaller database by generating the N-gram table based on the groups resulting from the general
classification of the categories of the patterns constituting a sentence.

There is another advantage in that the grouping of the categories is carried out such that the similarity of pattern
shapes is successfully reflected, thus permitting a high recognition rate of sentences.

(Third Embodiment) 

In a third embodiment, an example will be described wherein sub-patterns extracted from an input pattern are
layered to generate a classification tree.

A preferred embodiment of the present invention will be described in conjunction with the accompanying drawings. 

(In the case of images) 

Fig. 26 is a block diagram showing the configuration of an information processing apparatus common to all of the
following embodiments in accordance with the present invention.

The apparatus is comprised of a pattern input device 2601, a display 2602, a central processing unit (CPU) 2603, 
and a memory 2604. 

The pattern input device 2601, for example, has a digitizer and a pen if it is intended for online character
recognition; it hands the coordinate data on a character or graphic, which has been entered on the digitizer by using
the pen, over to the CPU 2603. The pattern input device may be a scanner for optically reading an image, a microphone
for receiving voice, or any other means as long as it receives a pattern to be recognized; a pattern which has been
entered through these input means may even be entered through a communication means. The display 2602 displays pattern
data entered through the pattern input device 2601 and also a recognition result given by the CPU 2603; it may be a
CRT, LCD display, or the like. The CPU 2603 primarily recognizes an input pattern and controls all constituent devices.
The memory 2604 stores a recognition program and a dictionary used by the CPU 2603 and temporarily stores input
patterns, variables, etc. used by the recognition program.

Fig. 27 is a diagram which best illustrates the functional configuration of the embodiment. Reference
character 2701 denotes training patterns; 2702 denotes a sub-pattern extractor for cutting out the training patterns by
each sub-pattern; 2703 denotes a pre-layering processor for pyramidally developing sub-patterns; 2704 denotes layered
training sub-patterns; 2705 denotes a classification tree generator which generates a classification tree according to
the layered training sub-patterns; 2706 denotes a development variables discriminator used by the classification tree
generator to generate a classification tree; and 2707 denotes a classification tree generated by the classification tree
generator. The input in this aspect of the present invention is a training pattern and the output is a classification tree.

Fig. 28 shows primarily the configuration inside the memory of an information processing apparatus to which the
online handwritten character recognizing method according to the embodiment is applied. A CPU 2801, which is similar
to the one denoted as 2603 in Fig. 26, executes various types of processing described in the embodiment according to
control programs stored in a memory 2802 to be discussed later. The processing shown by a flowchart to be described
later is also implemented by the CPU 2801 according to the control program for the processing which is stored in the
memory 2802.

The memory 2802 has a program section 2802-1 for storing the control programs for the CPU 2801 to execute
various types of processing, and a data section 2802-2 for storing various parameters and data. The program section
stores, for example, the individual parts of the flowchart shown in Fig. 33 as subroutine programs. The subroutine
programs are: the processing program for the step S1001 for discriminating the state of a noticed node; the processing
program for the step S1002 for deleting nodes; the processing program for the step S1005 for leaf nodes; the processing
program for a step S3306 for selecting a proper neuron; the processing program for a step S3307 for generating a
branch of neurons; and a program for recognizing an input pattern by using a generated classification tree. The
subroutines for these types of processing are stored in the program section 2802-1. When executing each type of
processing to be described later, the control program for the processing is read from the memory 2802 and executed by
the CPU 2801. The data section 2802-2 has a training pattern buffer for tentatively holding training patterns, an area
for holding pyramidally developed training patterns, and a classification tree buffer for holding a classification tree
which is being generated.

A hard disk drive (HDD) 2803 holds all training patterns and also holds the data of a classification tree which has
been generated according to the method described in the embodiment. The data of the classification tree makes it
possible to trace the route indicated by the classification tree shown in Fig. 34.

The memory 2802 may be a built-in ROM, RAM, HD, or the like. The programs and data may be stored beforehand 
in the memory, or the programs or data may be read prior to processing from a storage medium such as a floppy disk 
(FD) or CD-ROM which may be removed from the main body of the apparatus. As another alternative, such programs 
or data may be read from another apparatus via a public line, LAN, or other communication means. 

An input device 2804 is used to enter a pattern to be recognized using a classification tree stored in the HDD 2803; 
a scanner may be used to recognize an image pattern by referring to a classification tree generated using optically
entered training image patterns. A pen and digitizer or touch panel may be used to recognize stroke data entered using 
a pen; or a microphone may be used to recognize voice data. 

Such recognition data may obviously be captured through the foregoing input means of another apparatus via a
public line, LAN, etc., in addition to being directly entered through the input means.

Referring now to Fig. 29 through Fig. 33, the operation of the present invention will be described. 

First, as the input patterns, ten numerals (categories) from 0 to 9 written on an 8 x 8 mesh will be taken. An input
pattern of 0 is shown at the bottom of Fig. 31.

It is assumed that there are 100 training patterns each for 0 to 9 for generating a dictionary. This means that there
will be a total of 1000 training patterns since there are a total of 10 categories. These are named LTi,j (Learning
Template i, j), where i denotes a suffix representing the categories of 0 to 9, and it takes a value in a range of
0 ≤ i ≤ 9; and j denotes a suffix representing a training pattern number, and it takes a value in a range of 1 ≤ j ≤ 100.

The method of generating the dictionary for pattern recognition in accordance with the embodiment is composed 
of three steps, namely, a step for extracting sub-patterns, a step for pyramidal development, and a step for generating 
a classification tree. These steps will be described in order in conjunction with the flowchart given in Fig. 29. 

(F2901) Sub-pattern extraction step 

In a step F2901 for extracting sub-patterns, a training pattern 400 is divided into sub-patterns as illustrated in
Fig. 30, and the sub-patterns are extracted. Fig. 30 illustrates the training pattern 400, namely, a handwritten letter
"A", which has been divided into a total of nine sub-patterns arranged in a 3 x 3 matrix and extracted. The
sub-patterns may be extracted so that they overlap, as shown in Fig. 30, or so that they do not overlap. The
sub-patterns are extracted according to a sub-pattern extracting rule stored in the memory 2802.

Fig. 31 illustrates the step for extracting the sub-patterns in detail. Fig. 31 shows a process in which a central
portion of a training pattern, namely, a written numeral 0, is being extracted. The central portion may be considered
as equivalent to a sub-pattern 401 out of the 3 x 3 sub-patterns shown in Fig. 30.

In Fig. 31, a training pattern 501 is represented in an 8 x 8 bit map, and a total of nine (3 x 3) sub-patterns repre- 
sented in a 4 x 4 bit map are extracted. 
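
The extraction of nine overlapping 4 x 4 sub-patterns from an 8 x 8 bit map can be sketched as follows. This is a minimal sketch, not the extracting rule stored in the memory 2802: the function name `extract_sub_patterns` and the window step of 2 pixels are assumptions chosen because they yield exactly 3 x 3 windows; the document does not state the step.

```python
def extract_sub_patterns(bitmap, size=4, stride=2):
    """Cut size x size windows out of a square bit map given as a list of rows."""
    n = len(bitmap)
    subs = []
    for top in range(0, n - size + 1, stride):
        for left in range(0, n - size + 1, stride):
            subs.append([row[left:left + size] for row in bitmap[top:top + size]])
    return subs

# A crude 8 x 8 numeral "0" (1 = black/ON, 0 = white/OFF), for illustration only.
zero = [
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
]
subs = extract_sub_patterns(zero)  # nine windows, listed row by row
```

Setting `stride` equal to `size` would give non-overlapping windows, which the document also permits.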

(F2902) Pyramidal development step 

Each of the nine sub-patterns which have been extracted in the sub-pattern extraction step F2901 will have a
pyramid structure of three layers 502 to 504 as shown in Fig. 31. In Fig. 31, the topmost layer 504 is composed of a
group of 1 x 1 neurons, the middle layer 503 is composed of a group of 2 x 2 neurons, and the bottommost layer 502 is
composed of a group of 4 x 4 neurons.

An extracted training sub-pattern is first input to the bottommost layer of 4 x 4 neurons shown in Fig. 31. At this
time, it is assumed that the neurons in the white portion of the input pattern (LTi,j) 501 are OFF, while the neurons in
the black portion are ON. Hereafter, "black" will mean that the neurons are ON, and "white" will mean that the neurons
are OFF.

The configuration of the pyramid is extremely simple; if any one neuron that is ON exists in the 2 x 2 neurons of a 
lower layer, then one neuron of the layer immediately above the layer should be ON. In Fig. 31, neurons 507 and 508 
out of neurons 505 to 508 in the sub-pattern 502 are ON; therefore, a neuron 509 corresponding to the neurons 507 
and 508 is also ON. This rule applies in processing the input patterns upward. The configuration or rule of the pyramid, 
however, is not limited thereto; as an alternative, a black neuron may be counted as 1 and when a mean value exceeds 
a threshold value, an upper neuron is turned ON, or other rule may be adopted as long as the state of an upper neuron 
is decided by the states of a plurality of lower neurons. 

The processing for deciding the states of upper neurons according to the states of lower neurons is carried out on 
all the neurons constituting a sub-pattern, and the processing is repeated for all the sub-patterns. 
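
The pyramidal development rule described above (an upper neuron is ON if any of the 2 x 2 lower neurons beneath it is ON) can be sketched as a logical OR over 2 x 2 blocks. The function names are hypothetical and the sketch covers only this OR rule, not the threshold-based alternative that the document also mentions.

```python
def develop_upward(layer):
    """Build the next layer up: an upper neuron is ON if any of its 2 x 2 children is ON."""
    half = len(layer) // 2
    return [
        [
            int(any(layer[2 * r + dr][2 * c + dc] for dr in (0, 1) for dc in (0, 1)))
            for c in range(half)
        ]
        for r in range(half)
    ]

def build_pyramid(sub_pattern):
    """Return the layers from the bottommost (4 x 4) to the topmost (1 x 1)."""
    layers = [sub_pattern]
    while len(layers[-1]) > 1:
        layers.append(develop_upward(layers[-1]))
    return layers

bottom = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
pyramid = build_pyramid(bottom)  # [4 x 4 layer, 2 x 2 layer, 1 x 1 layer]
```

In this example only the bottom left block of the 4 x 4 layer contains ON neurons, so only the bottom left neuron of the 2 x 2 layer is ON, and the single topmost neuron is ON as well.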

(F2903) Classification tree generating step

All the training patterns (LTi,j) are pyramidally developed as illustrated in Fig. 31 in the pyramidal development
step F2902. The classification tree will be generated from top to bottom, that is, in the opposite direction from that
of the pyramidal development F2902.

The node of a root begins with the neuron of the topmost layer (1 x 1) of Fig. 31.

As a result of the pyramidal development of the training sub-patterns (LTi,j), at least one neuron of the pattern 503
(2 x 2) of the second layer of Fig. 31 should be ON. This is because, according to the rule employed for the
embodiment, the neurons of the second layer (2 x 2) cannot all be OFF unless a completely white training sub-pattern
exists. Hence, the state of the neuron of the topmost layer (1 x 1) is ON with respect to all the training sub-patterns
(LTi,j).

There are sixteen ($2^4$) states of the second layer (2 x 2) (strictly speaking, there are fifteen states since there
is no state in which all neurons are OFF, as described above); therefore, sixteen branches extend from the root node as
shown in Fig. 32.

The states of the branches shown in Fig. 32 are indicated by showing the ON state of the group of neurons of the
second layer shown in Fig. 31, wherein the black areas indicate ON, while the white areas indicate OFF.

The branches with "X" indicated in the column showing the types of categories correspond to the case 1 where no 
training sub-patterns (LTij) exist, and therefore they are eliminated. (Strictly speaking, the leftmost branch does not 
extend from the root.) 

The eighth branch from the left has the training sub-patterns of only the category 1. This corresponds to the case
2 where the sub-patterns of only one particular category (e.g. "1") of the training sub-patterns (LTi,j) exist, so that
the branch is turned into a leaf.

For instance, the fifth branch from the right has the training sub-patterns of the categories 4, 5, and 6; this
corresponds to the case 3 rather than the case 1 or 2, namely, the sub-patterns of a plurality of categories are mixed.
Thus, this branch provides a node.
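
The three cases above (no training sub-patterns, a single category, several categories mixed) can be summarized in a small helper. The function name `node_state` and the returned labels are hypothetical and serve only to restate the case distinction.

```python
def node_state(categories_in_branch):
    """Classify a branch by the categories of the training sub-patterns it holds."""
    distinct = set(categories_in_branch)
    if not distinct:
        return "delete"  # case 1: no training sub-patterns exist, the branch is eliminated
    if len(distinct) == 1:
        return "leaf"    # case 2: only one category exists, the branch becomes a leaf
    return "node"        # case 3: several categories are mixed, the branch becomes a node
```

For example, a branch holding only sub-patterns of category 1 yields "leaf", while a branch holding categories 4, 5, and 6 yields "node" and must be developed further.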

The processing for generating a classification tree is illustrated by the flowchart given in Fig. 33. The following
will describe the procedure for generating a classification tree as shown in Fig. 35. The steps S1000 to S1005 are the
same as those in the first embodiment described in conjunction with Fig. 10; therefore, only steps S3306 to S3308 will
be described.

In the step S3306, one out of the neurons included in a node is selected according to the entropy standard.

In the step S3307, a branch of the set of neurons of a lower-rank layer of the selected neuron is generated.

Fig. 34 illustrates the processing implemented in this step; it shows an example of the set of neurons of the
lower-rank layer when a top left neuron has been selected.

Referring to Fig. 34, it is assumed that a neuron 900 is the neuron which has been selected in the step S3306.
There are fifteen different combinations of the states of neurons in the lower-rank layer corresponding to the selected
neuron, that is, there are fifteen different patterns for the lower-rank layer. Each of these combinations provides a
new node for generating a branch.
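
The branch generation of the step S3307 can be sketched as grouping the training sub-patterns of a node by the states of the four lower-rank neurons under the selected neuron. The names `child_state` and `develop_branches` and the representation of a sample as a (category, lower layer) pair are assumptions made for this illustration.

```python
from collections import defaultdict

def child_state(lower_layer, r, c):
    """States of the 2 x 2 lower-layer neurons under upper neuron (r, c), as a 4-tuple."""
    return tuple(lower_layer[2 * r + dr][2 * c + dc] for dr in (0, 1) for dc in (0, 1))

def develop_branches(samples, r, c):
    """Group (category, lower_layer) samples by the child state of the selected neuron."""
    branches = defaultdict(list)
    for category, lower_layer in samples:
        branches[child_state(lower_layer, r, c)].append(category)
    return dict(branches)

# Three illustrative 4 x 4 bottom layers; the top left neuron (0, 0) of the 2 x 2 layer is ON in all.
a = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
b = [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
c = [[1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
branches = develop_branches([(4, a), (5, b), (6, c)], r=0, c=0)
```

Because the selected neuron is ON, the all-OFF child state (0, 0, 0, 0) cannot occur, which leaves the fifteen combinations mentioned above.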

This completes the description of the processing implemented in the step S3307.

The program then proceeds to the step S3308 where it sets one of the nodes of the generated branches as the next
noticed node. In the step S3308, the program moves the noticed node and goes back to the step S1001 to repeat the
same processing.

Generating the classification tree as shown in Fig. 35 according to the procedure described above makes it
possible to generate a classification tree which reflects detailed characteristic differences among similar categories
while maintaining general classification of the patterns which have many characteristics. Quick recognition of
characters with a high recognition rate can be achieved by referring to the generated classification tree.

The method for generating branches from the nodes will now be described. The above description has been given
for the case where the top left neuron had been selected. Naturally, the branches should be developed from the nodes
as efficiently as possible. High efficiency is achieved by selecting neurons which enable as much information as
possible on categories to be obtained when branches are developed.

Generally, there are so many ways to develop the branches under such conditions that it is difficult to decide which
one should be adopted. This has hitherto been an obstacle to successful generation of a classification tree used for
recognition.

Here, the branches to be developed from a node are limited to those obtained by developing, toward the lower layers,
a neuron that is ON at this node. For instance, in the case of the fifth branch from the right shown in
Fig. 32, one of the three neurons, namely, the top left, bottom left, and bottom right neurons of the second layer shown
in Fig. 31, is selected, and the branches related to the states of the bottom four neurons of the third layer under the
selected neuron shown in Fig. 31 are developed.

This significantly reduces the time for the calculation required to develop the branches. In addition, such a
limitation does essentially no serious damage to the classifying performance of the classification tree to be generated.

A description will now be given of a method for selecting, among the neurons that are ON at the node, the
neuron enabling the highest efficiency in the development.

The number of the sub-patterns of category No. i among the training sub-patterns (LTi,j) which exist in a certain
node is denoted as Ni. When the total number of the training sub-patterns existing in the node is denoted as N, then the
existence probability pi of each category in the node can be expressed as follows:

$$p_i = N_i / N$$

where

$$N = \sum_{i=0}^{9} N_i$$

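Therefore, the entropy at the time when the information on the node is obtained will be represented by the following
expression:

$$\mathrm{Entropy}_{node} = -\sum_{i=0}^{9} p_i \log(p_i) = -\sum_{i=0}^{9} \frac{N_i}{N} \log\left(\frac{N_i}{N}\right) = \frac{1}{N} \sum_{i=0}^{9} N_i (\log N - \log N_i) \qquad \text{Expression (11)}$$
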

Then, one of the neurons which are ON in this node is selected and the decrement of the entropy when a branch 
is developed therefrom is calculated. 

As described above, the number of the branches developed from the single neuron toward lower layers is sixteen.
The distribution of the training sub-patterns (LTi,j) among the sixteen branches is indicated by the number of the
training sub-patterns (LTi,j) which exist in the developed branches, i.e.,

$$N_{i,b}$$

where i of Ni,b denotes a category number and b denotes a branch number.

At this time, the entropy at which the information on each branch has been obtained is represented by the following
expression as is the case with the foregoing discussion:

$$\mathrm{Entropy}_{branch} = -\sum_{i=0}^{9} p_i \log(p_i) = -\sum_{i=0}^{9} \frac{N_{i,b}}{N_b} \log\left(\frac{N_{i,b}}{N_b}\right) \qquad \text{Expression (12)}$$



In this expression,

$$N_b = \sum_{i=0}^{9} N_{i,b}$$

indicates the total number of the training sub-patterns (LTi,j) which exist in the branch b.
The probability of distribution into each branch is expressed by:

$$N_b / N$$

where N is identical to N in the expression (11), and therefore, the average entropy at the time when the
branches are developed is represented by the following expression:

$$\overline{\mathrm{Entropy}}_{branch} = -\sum_{b=1}^{16} \frac{N_b}{N} \sum_{i=0}^{9} \frac{N_{i,b}}{N_b} \log\left(\frac{N_{i,b}}{N_b}\right) = \frac{1}{N} \sum_{b=1}^{16} \sum_{i=0}^{9} N_{i,b} (\log N_b - \log N_{i,b}) \qquad \text{Expression (13)}$$

The average decrement of the entropy is obtained by:

$$\mathrm{EntropyDecrease} = \mathrm{Entropy}_{node} - \overline{\mathrm{Entropy}}_{branch} \qquad \text{Expression (14)}$$

A value obtained by dividing this value by the logarithm of the number of the branches as shown below represents
the classification efficiency when the branches are developed:

$$\frac{\mathrm{EntropyDecrease}}{\log(\mathrm{BranchNumber})} \qquad \text{Expression (15)}$$

A neuron which gives this value a maximum value is selected to develop the branches.
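
The selection criterion can be sketched numerically as follows: the node entropy minus the average branch entropy, divided by the logarithm of the number of branches. This is an illustrative sketch only; the function names, the representation of a node as a plain list of category numbers, and the default `branch_number=16` are assumptions, and the natural logarithm is used because the document does not state a base.

```python
import math
from collections import Counter

def entropy(categories):
    """-sum p_i log p_i over the categories present in a node or a branch."""
    n = len(categories)
    return -sum((k / n) * math.log(k / n) for k in Counter(categories).values())

def classification_efficiency(node_categories, branches, branch_number=16):
    """Entropy decrease of one candidate neuron divided by log(BranchNumber).

    branches: list of lists, the categories falling into each developed branch.
    """
    n = len(node_categories)
    # Average branch entropy: each branch is weighted by its share N_b / N.
    average_branch_entropy = sum(len(b) / n * entropy(b) for b in branches if b)
    decrease = entropy(node_categories) - average_branch_entropy
    return decrease / math.log(branch_number)

node = [1, 1, 7, 7]
# Candidate A separates the two categories completely; candidate B does not separate them at all.
efficiency_a = classification_efficiency(node, [[1, 1], [7, 7]])
efficiency_b = classification_efficiency(node, [[1, 7], [1, 7]])
best = max(("A", efficiency_a), ("B", efficiency_b), key=lambda item: item[1])[0]
```

Candidate A removes all uncertainty about the category, so it gives the larger value and would be the neuron selected for development.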

The branches may be developed in relation to a group of a plurality of neurons rather than developing only one neu- 
ron. 

In this case, BranchNumber in the expression (15) will be obtained by multiplying the number of neurons by 16.
Strictly speaking, however, a state where all neurons of the lower layers involved in the development are OFF cannot
occur. To be accurate, therefore, BranchNumber will be the number of neurons multiplied by 15. In this embodiment,
the value obtained in the expression (15) is adopted as the value which indicates the classification efficiency when the
branches are developed; however, it is obvious that the value is not limited to the one obtained by the expression (15)
as long as it is a function representing the development efficiency of branches, such as the "Gini criterion" described
in the literature titled "Classification and Regression Trees".

Thus, once a neuron or a set of neurons to be developed is decided, the branches are developed and leaves and
nodes are generated accordingly.

Lastly, when all neurons have been turned into leaves, the classification tree is completed.

Fig. 35 shows an example of the classification tree which has been generated in the process F2903 for generating
a classification tree and stored in the HDD 2803.

In Fig. 35, the branches which have been deleted in S1002 are omitted. The circled branches in Fig. 35 indicate
that they are leaves which have been assigned category numbers as free nodes in S1005.

All branches other than the leaves will turn into nodes; therefore, further branch development will be implemented. 
Fig. 35 shows the result of the further branch development related only to the third node from the right. 

In the third node from the right, three types of categories, namely, "1", "7", and "9", coexist, requiring the
development of branches. It is assumed that the top right neuron of the first layer has been selected to be developed
in the first layer as a result given by the development variables discriminator. Then, $2^4 = 16$ branches are developed
as is the case shown in Fig. 33 with respect to the state of the top right neuron, and some branches are deleted, some
branches are turned into leaves, and some branches are turned into nodes. The branches which have turned into nodes
must be further developed until the ends of all branches are eventually turned into leaves.

In Fig. 35, for the purpose of clarity, the first layer and the second layer are superimposed to show the development
result of the third node from the right. Actually, these states are represented by the four neurons of the first layer and
the four top right neurons of the second layer of the pyramid illustrated in Fig. 29.

Fig. 36 shows the flow of recognizing an input pattern by using the classification tree generated using the
procedure described above. In Fig. 36, reference character 3601 denotes an input pattern; 3602 denotes a sub-pattern
extractor for extracting sub-patterns from the input pattern; 3603 denotes a pre-layering processor for pyramidally
layering input sub-patterns; 3604 denotes layered sub-patterns resulting from the pyramidal layering process; 3605
denotes a classification tree; 3606 denotes a category discriminator for determining the discrimination probability of
categories according to the layered input sub-patterns and the classification tree; and 3607 denotes a discrimination
probability integrator for integrating the discrimination probabilities of the respective categories obtained by the
category discriminator. The inputs of this aspect of the present invention are input patterns and the outputs thereof
are recognition candidates.

Preferably, the foregoing classification tree is the classification tree which can be generated in this embodiment.

The input pattern 3601 corresponds to the training pattern 2701; the substantial data configuration is the same
although it is entered through an input device 3601. The sub-pattern extractor 3602 and the pre-layering processor
3603 are exactly the same as those corresponding devices shown in Fig. 27. In the case shown in Fig. 27, there were
as many layered training sub-patterns 2704 as the training patterns, while there is only one set of layered input
sub-patterns 3604, which is derived from an input pattern, in this embodiment.

When a leaf is reached as the classification tree shown in Fig. 36 is traced according to the layered input
sub-patterns 3604, the category discriminator causes a display or a printer to output the categories existing in the
leaf at that point as a recognition result.

If no leaf is reached, then the category probability included in the node passed through last is output as a result. 
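
Tracing the classification tree during recognition, including the fallback to the last node passed when no leaf is reached, might be sketched as below. The nested-dictionary tree, the keys "branches" and "probabilities", and the idea of passing the observed neuron states as a ready-made list are all simplifying assumptions for this illustration; the document does not specify how the tree data is stored.

```python
def recognize(tree, states):
    """Follow the observed states down the tree and return category probabilities.

    tree:   nested dict; every node has "probabilities", inner nodes also have "branches".
    states: one observed lower-layer state (4-tuple) per level of the tree.
    """
    node = tree
    for state in states:
        child = node.get("branches", {}).get(state)
        if child is None:
            break  # leaf reached, or no matching branch: stop at the last node passed
        node = child
    return node["probabilities"]

# Hypothetical miniature tree with one leaf and one inner node below the root.
tree = {
    "probabilities": {1: 0.2, 4: 0.4, 5: 0.4},
    "branches": {
        (1, 0, 0, 0): {"probabilities": {1: 1.0}},
        (1, 1, 0, 0): {
            "probabilities": {4: 0.5, 5: 0.5},
            "branches": {(1, 0, 1, 0): {"probabilities": {4: 1.0}}},
        },
    },
}
```

With this tree, the states [(1, 1, 0, 0), (0, 1, 1, 1)] match no branch at the second level, so the probabilities of the inner node, categories 4 and 5 at 0.5 each, are returned.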

The discrimination probability integrator 3607 determines an arithmetic mean, geometric mean, or other mean of 
the results of each sub-pattern given by the category discriminator 3606. 
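
The integration of the per-sub-pattern results can be sketched as an arithmetic or geometric mean of the category probabilities. The function name `integrate`, the dictionary representation, and the treatment of a missing category as probability 0 are assumptions of this sketch.

```python
import math

def integrate(probabilities_per_sub_pattern, mean="arithmetic"):
    """Combine per-sub-pattern category probabilities into one score per category."""
    categories = set().union(*probabilities_per_sub_pattern)
    k = len(probabilities_per_sub_pattern)
    scores = {}
    for category in categories:
        # A category absent from one sub-pattern's result is treated as probability 0.
        values = [p.get(category, 0.0) for p in probabilities_per_sub_pattern]
        if mean == "arithmetic":
            scores[category] = sum(values) / k
        else:  # geometric mean; a single zero makes the whole score zero
            scores[category] = math.prod(values) ** (1.0 / k)
    return scores

results = [{1: 0.5, 7: 0.5}, {1: 1.0}, {1: 0.5, 9: 0.5}]
arithmetic = integrate(results)
geometric = integrate(results, mean="geometric")
candidate = max(arithmetic, key=arithmetic.get)
```

The geometric mean penalizes a category that any one sub-pattern rules out, whereas the arithmetic mean lets the remaining sub-patterns outvote it; which behaviour is preferable depends on how reliable the individual sub-pattern results are.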

(In the case of strokes)

In this embodiment, the training data and the data to be recognized, which are entered, are stroke data
entered through a pen or other type of coordinate input means. While the data involved in the example of images
described above was handled as bit map data, the input strokes handled in this embodiment are divided and quantized
into vectors. The entire flow, however, is based on that of the first embodiment.

Fig. 37 shows a processing flowchart which illustrates the procedure for generating a classification tree in this
embodiment. Reference character 3701 indicates a training stroke; 3702 denotes a stroke divider for dividing the
training stroke; 3703 denotes a vector generator for making the stroke segments vectors, the stroke segments having been
produced by the stroke divider; 3704 denotes a sub-vector extractor for partially extracting sub-vectors from a vector
series obtained by the vector generator; 3705 denotes a pre-layering processor for layering the vector series produced
by the sub-vector extractor; 3706 denotes a layered sub-vector series produced by the pre-layering processor; 3707
denotes a classification tree generator for generating a classification tree according to the layered vector series; 3708
denotes a development vector discriminator used by the classification tree generator to generate a classification tree;
and 3709 denotes the classification tree generated by the classification tree generator.

In this embodiment, the inputs are training strokes and the outputs are classification trees. 

Referring now to Fig. 37 to Fig. 39, a description will be given of the operation of this embodiment.

Three different characters

"く", "し", and "つ"

which read "ku", "shi", and "tsu", respectively, each of which is drawn in one stroke, will be taken as the examples
representing categories to be recognized.

It is assumed that there are one hundred training patterns each for

"く", "し", and "つ",

respectively, for generating a dictionary; these are denoted as follows:

TPi,j (Training Pattern i, j)

where i is a suffix denoting the category and it takes a value in a range of 0 ≤ i ≤ 2, and j is a suffix denoting a
training pattern number and it takes a value in a range of 1 ≤ j ≤ 100.

As illustrated by the flowchart shown in Fig. 38, the method of generating the dictionary for the online handwritten
character recognition according to the embodiment is composed of four steps, namely, a vector generation step, a
sub-vector extraction step, a pre-layering process step, and a classification tree generation step. The vector generation
step, the pre-layering process step, and the classification tree generation step are the same as those described in the
first embodiment in conjunction with Fig. 3; therefore, only the sub-vector extraction step will be described.

(F38) Sub-vector extraction step 

Referring to Fig. 39, the sub-vector extraction step F38 will be described in detail.

In Fig. 39, the stroke is equally divided into sixteen segments and converted to vectors of 5421124554211245.

The vector series composed of the sixteen vectors is partially extracted to form three groups of sub-vector series, 
each group being composed of an eight-vector series. 

The sub-vector series may be extracted as illustrated in Fig. 39 where they are overlapped, or they may be
extracted such that they do not overlap at all.

The number of the vectors included in each sub-vector series is eight in Fig. 39; however, the number is not limited
thereto.
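
The extraction of three overlapping eight-vector groups from the sixteen-vector series can be sketched with a sliding window. The function name `extract_sub_vectors` and the step of four vectors between windows are assumptions chosen so that exactly three groups result; the actual offsets used in Fig. 39 are not stated in the text.

```python
def extract_sub_vectors(vectors, length=8, stride=4):
    """Slide a window of `length` vectors over the series in steps of `stride`."""
    return [vectors[s:s + length] for s in range(0, len(vectors) - length + 1, stride)]

# The sixteen quantized direction codes given for the stroke, one character per vector.
series = "5421124554211245"
groups = extract_sub_vectors(series)
```

With `stride=8` the same function returns two non-overlapping groups, which corresponds to the non-overlapping alternative mentioned above.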

Fig. 40 shows a processing flowchart which illustrates the procedure for online handwritten character recognition.
Reference character 4001 indicates a handwritten stroke entered by a user; 4002 denotes a stroke divider for dividing
the handwritten stroke; 4003 denotes a vector generator for making the stroke segments vectors, the stroke segments
having been produced by the stroke divider; 4004 denotes a sub-vector extractor for partially extracting vectors from a
vector series obtained by the vector generator; 4005 denotes a pre-layering processor for layering the vector series
produced by the sub-vector extractor; 4006 denotes a layered sub-vector series produced by the pre-layering processor;
4007 denotes a classification tree which provides the information necessary for category classification; 4008 denotes
a category discriminator which determines the category of the handwritten stroke according to the layered vector series
by referring to the classification tree; and 4009 denotes a discrimination probability integrator which integrates the
discrimination probability of each category received from the category discriminator. In this embodiment, the inputs are
handwritten strokes and the outputs are recognition candidates. Preferably, the foregoing classification tree is the
classification tree which can be generated in the foregoing example.

The handwritten stroke 4001 corresponds to the training stroke 3701; it is substantially the same. The stroke divider 4002, the vector generator 4003, the sub-vector extractor 4004, and the pre-layering processor 4005 are exactly the same as the corresponding devices shown in Fig. 37. In the case shown in Fig. 37, there were as many layered sub-vector series 3706 as there were training patterns, while in this example there is only one layered sub-vector series 4006, which is derived from the handwritten stroke.

When a leaf is reached as the classification tree shown in Fig. 7 is traced according to the layered sub-vector series 4006, the category discriminator 4008 outputs the categories existing in that leaf as a recognition result. If no leaf is reached, the category probability included in the node passed through last is output as the result.
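
The tracing rule, including the fallback to the last node passed through, can be illustrated with the following Python sketch. The tree representation is an assumption made only for this example: a leaf is a dict holding `"categories"`, an internal node holds the hypothetical keys `"index"` (which element of the series it tests), `"probs"` (its category probabilities) and `"children"`. Returning a uniform probability over the categories of a leaf is likewise an assumption.

```python
def classify(tree, series):
    """Trace a classification tree and return category probabilities.

    If a leaf is reached, its categories are returned (here with a
    uniform probability, which is an assumption of this sketch).
    If no child matches, the probabilities stored in the last node
    passed through are returned instead.
    """
    node = tree
    while True:
        if "categories" in node:                      # leaf reached
            cats = node["categories"]
            return {c: 1.0 / len(cats) for c in cats}
        child = node["children"].get(series[node["index"]])
        if child is None:                             # no leaf reachable
            return dict(node["probs"])
        node = child


# A tiny hand-made tree used only to exercise the function.
example_tree = {
    "index": 0,
    "probs": {0: 0.5, 1: 0.5},
    "children": {
        5: {"categories": [0]},
        1: {
            "index": 1,
            "probs": {1: 0.75, 2: 0.25},
            "children": {2: {"categories": [1]}},
        },
    },
}

leaf_result = classify(example_tree, [5, 4])       # reaches a leaf
fallback_result = classify(example_tree, [1, 7])   # stops at an inner node
```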

The discrimination probability integrator 4009 determines an arithmetic mean, geometric mean, or other mean of the results for the sub-patterns received from the category discriminator 4008.
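
As a minimal sketch of this integration, assuming each sub-pattern result is a dict that maps a category to a probability (a representation chosen for this example, not taken from the text), the arithmetic and geometric means can be computed as follows. The function name `integrate` is hypothetical, and a category missing from a result is treated as having probability 0.

```python
import math


def integrate(results, mean="arithmetic"):
    """Combine per-sub-pattern category probabilities into one score per category."""
    categories = set().union(*results)   # every category seen in any result
    n = len(results)
    combined = {}
    for c in categories:
        values = [r.get(c, 0.0) for r in results]   # missing category counts as 0
        if mean == "arithmetic":
            combined[c] = sum(values) / n
        elif mean == "geometric":
            combined[c] = math.prod(values) ** (1.0 / n)
        else:
            raise ValueError("unsupported mean: " + mean)
    return combined


sub_results = [{0: 0.6, 1: 0.4}, {0: 0.8, 1: 0.2}, {0: 1.0}]
arithmetic = integrate(sub_results)
geometric = integrate(sub_results, mean="geometric")
```

With a geometric mean, a single sub-pattern that assigns zero probability to a category eliminates that category, whereas the arithmetic mean only lowers its score; which behaviour is preferable depends on how reliable the individual sub-patterns are.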

(In the case of voices) 

It is also possible to recognize voices by using the classification tree generating procedure and the recognizing procedure which have been described in the foregoing example for images.

Voice data is expressed as time series data, which is subjected to Fourier transformation to extract envelopes. The result is illustrated in Fig. 42, which shows an example of the result of the Fourier transformation of the voice data for a Japanese phrase pronounced "a-shi-ta-i-ku", meaning "will go tomorrow". As may be seen from Fig. 42, unlike the binary bit map data, the processed voice data has analog intensity values and a three-dimensional shape with an undulating surface like a mountain range.

The three-dimensional data is cut on predetermined axes and converted to N pieces of two-dimensional bit map data. This enables a classification tree to be generated by implementing the classification tree generating procedure for the bit map data described in the foregoing embodiment. Input voices can be represented in terms of bit map data by the Fourier transformation and the cutting by predetermined axes, so that they can also be recognized.

Fig. 43 illustrates the data of Fig. 42 which has been cut using intensity and frequency as the cutting axes; and Fig. 44 illustrates the data of Fig. 42 which has been cut using frequency and time.

The recognition result of the entire three-dimensional configuration like the one shown in Fig. 42 can be obtained by averaging (e.g. arithmetic averaging) the recognition results of the N pieces of two-dimensional bit maps as described above.
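
One way to picture the cutting step is sketched below in Python. It assumes the transformed voice data is held as a table `spectrum[t][f]` of intensities indexed by time and frequency, and it interprets a cut in the frequency-time plane as thresholding at a fixed intensity level; both the data layout and this interpretation are assumptions of the sketch, and the function name `cut_by_intensity` is hypothetical.

```python
def cut_by_intensity(spectrum, levels):
    """Return one binary bit map per intensity level.

    spectrum[t][f] is the intensity at time index t and frequency index f.
    A cell of the bit map is 1 where the surface reaches the cutting level.
    """
    return [
        [[1 if value >= level else 0 for value in row] for row in spectrum]
        for level in levels
    ]


# A 2 x 2 toy surface cut at N = 2 intensity levels.
toy_spectrum = [[0.1, 0.9],
                [0.5, 0.3]]
bit_maps = cut_by_intensity(toy_spectrum, levels=[0.2, 0.6])
```

Each of the resulting bit maps can then be recognized separately, and the per-bit-map results averaged to give the result for the whole three-dimensional shape.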

The present invention relates to a classification generation method whereby, in order to efficiently and accurately recognize a pattern having a large number of characteristics, a pattern classification tree is generated, with which a macro structural characteristic of a pattern is appropriately reflected and a competitive relationship between categories is adequately reflected, and to a method for recognizing an input pattern by using the generated classification tree.

When an input pattern is formed using strokes, a training stroke is divided into a plurality of segments, and vector quantization is performed for the strokes in the segments. Among the quantized strokes in the segments, adjacent stroke sets are synthesized to repetitively generate upper rank stroke vectors. A stroke vector for which a predetermined entropy function is maximized is selected from the upper rank stroke vectors in a layered stroke vector series, and development is performed extending down into the lower rank stroke vector sets. As a result, a classification tree is prepared.


Claims 

1. An information processing method for generating a classification tree, which is a recognition dictionary used for character recognition, comprising:

a division step of dividing a predetermined training stroke into a plurality of segments;

a vector quantization step of performing vector quantization of said strokes in said segments obtained at said division step;

a layered stroke vector generation step of synthesizing adjacent strokes of said segments, obtained at said division step, to obtain stroke sets to generate upper rank stroke vectors, and of producing a layered vector series; and

a classification tree generation step of selecting a stroke vector, for which a predetermined entropy function is the greatest, from upper rank stroke vectors in said layered stroke vector series that is obtained at said layered stroke vector generation step, and of developing said stroke vector to produce lower rank stroke vectors to generate a classification tree.

2. A method according to claim 1, wherein, at said vector quantization step, said strokes in said segments obtained at said division step are quantized to obtain vectors in eight directions, with intersecting angles formed by adjacent vectors being equal to each other.

3. A method according to claim 1, wherein, at said vector quantization step, said strokes in said segments obtained at said division step are quantized to obtain vectors in sixteen directions, with intersecting angles formed by adjacent vectors being equal to each other.

4. A method according to claim 1, wherein said entropy function is a function whereby an entropy reducing value is 
output when information is obtained for a lower rank vector set of one of said upper stroke vectors of said layered 
stroke vector series, which is generated at said layered stroke vector generation step. 

5. A method according to claim 1, wherein at said classification tree generation step, if a training stroke corresponding to said lower rank stroke vector set is not present, said lower rank stroke vector set for said classification tree is regarded as invalid.

6. A method according to claim 5, wherein at said classification tree generation step, if a training stroke for a single category that corresponds to said lower rank stroke vector set is present, a number for said single category is attached to said lower rank stroke vector set.

7. A method according to claim 6, wherein at said classification tree generation step, if a training stroke for a plurality of categories that corresponds to said lower rank stroke vector is present, an upper rank stroke vector with which said predetermined entropy function is maximized is selected from upper rank stroke vectors for said lower rank stroke vector set.

8. A method according to claim 1, further comprising the step of recognizing a character using the generated classification tree, said recognizing step including:

a division step of dividing an input stroke into a plurality of segments;

a vector quantization step of performing vector quantization of strokes in said segments obtained at said division step;

a layered stroke vector generation step of synthesizing adjacent strokes of said segments, obtained at said division step, to obtain stroke sets to generate upper rank stroke vectors, and of producing a layered vector series; and

a recognition step of acquiring a recognition category by tracing said classification tree in order from said upper rank stroke vectors to lower rank stroke vectors in said layered stroke vector series, which is generated at said layered stroke vector generation step.

9. An information processing apparatus for generating a classification tree, which is a recognition dictionary used for character recognition, comprising:

division means for dividing a predetermined training stroke into a plurality of segments;

vector quantization means for performing vector quantization of said strokes in said segments obtained by said division means;

layered stroke vector generation means for synthesizing adjacent strokes of said segments, obtained by said division means, to obtain stroke sets to generate upper rank stroke vectors, and for producing a layered vector series; and

classification tree generation means for selecting a stroke vector, for which a predetermined entropy function is the greatest, from upper rank stroke vectors in said layered stroke vector series that is obtained by said layered stroke vector generation means, and for developing said stroke vector to produce lower rank stroke vectors to generate a classification tree.




10. An apparatus according to claim 9, wherein said vector quantization means quantizes said strokes in said segments, obtained by said division means, in order to acquire vectors in eight directions, with intersecting angles formed by adjacent vectors being equal to each other.

11. An apparatus according to claim 9, wherein said vector quantization means quantizes said strokes in said segments, obtained by said division means, in order to acquire vectors in sixteen directions, with intersecting angles formed by adjacent vectors being equal to each other.

12. An apparatus according to claim 9, wherein said entropy function is a function whereby an entropy reducing value is output when information is obtained for a lower rank vector set of one of said upper stroke vectors of said layered stroke vector series, which is generated by said layered stroke vector generation means.

13. An apparatus according to claim 9, wherein, if a training stroke corresponding to said lower rank stroke vector set is not present, said classification tree generation means regards, as invalid, said lower rank stroke vector set for said classification tree.

14. An apparatus according to claim 13, wherein, if a training stroke for a single category that corresponds to said lower rank stroke vector set is present, said classification tree generation means attaches a number for said single category to said lower rank stroke vector set.

15. An apparatus according to claim 14, wherein, if a training stroke for a plurality of categories that corresponds to said lower rank stroke vector is present, said classification tree generation means selects an upper rank stroke vector, with which said predetermined entropy function is maximized, from upper rank stroke vectors for said lower rank stroke vector set.

16. An apparatus according to claim 9, further comprising means for recognizing a character using the generated classification tree, said recognizing means including:

division means for dividing an input stroke into a plurality of segments;

vector quantization means for performing vector quantization of strokes in said segments obtained by said division means;

layered stroke vector generation means for synthesizing adjacent strokes of said segments, obtained by said division means, to obtain stroke sets to generate upper rank stroke vectors, and for producing a layered vector series; and

recognition means for acquiring a recognition category by tracing said classification tree in order from said upper rank stroke vectors to lower rank stroke vectors in said layered stroke vector series, which is generated by said layered stroke vector generation means.